Google patent US20130144858A1 - Scheduling resource crawls

 

Google patents about crawling and indexing

Here is claim 1 of the Google patent US20130144858A1 - Scheduling resource crawls:

  1. A method for scheduling resource crawls, the method comprising:

    • providing a crawl scheduler;
    • receiving a resource crawl request from a user;
    • determining a crawl interval for the resource crawl request based on a plurality of factors, including:
      • the popularity of the resource;
      • the frequency of changes to the resource;
      • the bandwidth available to the crawl scheduler;
    • scheduling the resource crawl request to be performed at the crawl interval; and
    • performing the resource crawl request at the crawl interval.

The method described in Claim 1 allows a user to request that a resource be crawled. The crawl scheduler then determines a crawl interval for the resource based on a number of factors, including the popularity of the resource, the frequency of changes to the resource, and the bandwidth available to the crawl scheduler. The resource is then scheduled to be crawled at the crawl interval.
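The flow in Claim 1 (receive a request, pick an interval, schedule the crawl, perform it) can be sketched as a small priority queue. This is only an illustration under stated assumptions, not the patent's implementation: the CrawlScheduler class, its method names, and the fetch callback are hypothetical, and the interval is passed in by the caller rather than computed.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class ScheduledCrawl:
    due_at: float                            # time (seconds) when the crawl should run
    url: str = field(compare=False)          # resource to crawl
    interval: float = field(compare=False)   # seconds between crawls


class CrawlScheduler:
    """Hypothetical scheduler: a min-heap of crawl jobs ordered by due time."""

    def __init__(self):
        self._queue = []

    def request_crawl(self, url, interval, now):
        # A non-positive interval would make a job permanently due.
        if interval <= 0:
            raise ValueError("interval must be positive")
        heapq.heappush(self._queue, ScheduledCrawl(now + interval, url, interval))

    def run_due(self, now, fetch):
        """Perform every crawl that is due at `now`, then reschedule it."""
        crawled = []
        while self._queue and self._queue[0].due_at <= now:
            job = heapq.heappop(self._queue)
            fetch(job.url)  # `fetch` is supplied by the caller (assumed callback)
            crawled.append(job.url)
            heapq.heappush(
                self._queue,
                ScheduledCrawl(now + job.interval, job.url, job.interval),
            )
        return crawled
```

A caller would register resources with request_crawl and then call run_due periodically with the current time; each resource is re-queued one interval after it was last crawled.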

The main benefit of the method described in Claim 1 is automation. Earlier approaches to scheduling resource crawls were typically manual: the user had to specify the crawl interval for each resource, which was time-consuming and error-prone. Letting the scheduler choose the interval makes the process more efficient and more accurate.

Here are some of the factors that the crawl scheduler can take into account when determining the crawl interval for a resource:

  • Popularity: The more popular a resource is, the more frequently it should be crawled. Popular resources are more likely to change, and more users will want to see the latest version.
  • Frequency of changes: Resources that change often should be crawled more often than resources that change rarely, because a stale copy of a fast-changing resource goes out of date sooner.
  • Bandwidth: The bandwidth available to the crawl scheduler limits how many crawls it can perform. If bandwidth is limited, the scheduler may need to lengthen crawl intervals so that resources are crawled less frequently.
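To make the three factors concrete, here is a minimal sketch of how they might be combined into one interval. The formula, the parameter names, the 0 to 1 popularity scale, and the default bounds are all assumptions chosen for illustration; the patent text quoted above does not give a formula.

```python
def crawl_interval(popularity, changes_per_day, bandwidth_fraction,
                   min_hours=1.0, max_hours=720.0):
    """Return a crawl interval in hours (illustrative heuristic only).

    popularity:         score in [0, 1], higher means more popular (assumed scale)
    changes_per_day:    observed average number of changes per day
    bandwidth_fraction: share of crawl bandwidth currently free, in (0, 1]
    """
    if not 0.0 <= popularity <= 1.0:
        raise ValueError("popularity must be in [0, 1]")
    if changes_per_day < 0:
        raise ValueError("changes_per_day must be non-negative")
    if not 0.0 < bandwidth_fraction <= 1.0:
        raise ValueError("bandwidth_fraction must be in (0, 1]")

    # Start from the expected time between changes; a resource that has
    # never been seen to change starts at the longest allowed interval.
    if changes_per_day > 0:
        hours = 24.0 / changes_per_day
    else:
        hours = max_hours

    # More popular resources get a shorter interval (up to 2x more crawls).
    hours /= 1.0 + popularity

    # Less free bandwidth stretches the interval.
    hours /= bandwidth_fraction

    # Keep the result inside the allowed range.
    return max(min_hours, min(max_hours, hours))
```

With these assumed weights, a resource that changes once a day gets a 24-hour interval at zero popularity and full bandwidth, 12 hours at maximum popularity, and 48 hours when only half the bandwidth is free.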

The crawl scheduler can use a variety of techniques to determine the crawl interval for a resource. These techniques can include:

  • Heuristics: Heuristics are rules of thumb that can be used to make decisions. The crawl scheduler can use heuristics to determine the crawl interval for a resource based on factors such as the popularity of the resource and the frequency of changes to the resource.
  • Machine learning: Machine learning is a type of artificial intelligence that can be used to learn from data. The crawl scheduler can use machine learning to learn how to determine the crawl interval for resources based on historical data.

The crawl scheduler can use a combination of heuristics and machine learning to determine the crawl interval for a resource. This can help to ensure that resources are crawled frequently enough to meet the needs of users, while also avoiding overloading the crawl scheduler with too many requests.

Claim 2 of the patent US20130144858A1 - Scheduling resource crawls is as follows:

  2. The method of claim 1, wherein the crawl scheduler is configured to:

    • adjust the crawl interval for the resource crawl request based on a plurality of factors, including:
      • the number of times the resource has been crawled;
      • the number of errors that have occurred when crawling the resource;
      • the feedback received from users regarding the resource.

In other words, the crawl interval is not fixed once it has been chosen. The crawl scheduler keeps adjusting it using three signals: the number of times the resource has been crawled, the number of errors that have occurred when crawling the resource, and the feedback received from users regarding the resource.

The crawl scheduler can adjust the crawl interval for a resource in a number of ways. For example, it can increase the crawl interval if the resource has not changed recently, or if there have been no errors when crawling the resource. It can decrease the crawl interval if the resource has changed frequently, or if there have been a number of errors when crawling the resource.
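The adjustment rule in the paragraph above can be sketched as a multiplicative update applied after each crawl. The function name, the shrink and grow factors, and the bounds are hypothetical values picked for illustration, not values taken from the patent.

```python
def adjust_interval(current_hours, changed, errored,
                    shrink=0.5, grow=1.5, min_hours=1.0, max_hours=720.0):
    """Return the next crawl interval in hours after one crawl.

    changed: True if the crawl found that the resource had changed
    errored: True if the crawl hit an error
    """
    if changed or errored:
        # Changes or errors: crawl again sooner, as described above.
        new_hours = current_hours * shrink
    else:
        # No change and no error: back off and crawl less often.
        new_hours = current_hours * grow

    # Keep the interval inside the allowed range.
    return max(min_hours, min(max_hours, new_hours))
```

Applied repeatedly, this lets the interval drift toward the shortest allowed value for resources that keep changing and toward the longest allowed value for resources that stay the same.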

The crawl scheduler can also adjust the crawl interval based on feedback received from users. For example, it can decrease the crawl interval, so that the resource is crawled more often, if users have reported that they are not seeing the latest changes to a resource. It can also decrease the crawl interval if users have reported that they are seeing too many errors when trying to access a resource.

The crawl scheduler can use a variety of techniques to adjust the crawl interval for a resource. These techniques can include:

  • Heuristics: The crawl scheduler can use rules of thumb to adjust the crawl interval for a resource based on factors such as the number of times the resource has been crawled, the number of errors that have occurred when crawling the resource, and the feedback received from users.
  • Machine learning: The crawl scheduler can learn from historical data how to adjust the crawl interval for resources.

As with the initial interval, the crawl scheduler can combine heuristics and machine learning when adjusting it. The goal is the same: crawl resources often enough to meet the needs of users without overloading the crawl scheduler with too many requests.
