a Java applet programmed by Tim Ma, Dept. of Computer Science, UCLA
see update below
| Run Query Google | |
| Learn About Query Google: | Purpose What you need to run Query Google Specifying words to query Running the program Choosing the target language Details Questions and comments |
Purpose of Query Google
The purpose of Query Google is to permit the linguist to obtain the corpus frequencies of a chosen set of items from the Web, using the well-known Google search engine. The linguist prepares a list of forms for which (s)he wants to know the frequencies, and the program rapidly queries Google, obtaining the frequencies of each form.
What you need to run Query Google
Since Query Google is a Web applet, it should (in principle) be able to run on any type of computer. However, you do need to have the Sun Java Virtual Machine installed on your computer. If you don't have this software installed (test: try running Query Google and see what happens), you can obtain it for free from Sun Microsystems.
Specifying words to query
You should place the words you want to query in an input file, which should be a plain text file.
To create a plain text file, use the "Save As Text" option in your word processor, or use a simple text editor like Notepad (PC) or SimpleText (Mac) that comes with your computer. Each line of the text file should consist of a single word whose frequency you want Query Google to find. If you want to find the frequency of a phrase with more than one word in it, be sure to put quotation marks ("") around the whole phrase.
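If your list of forms comes from a script rather than a word processor, the one-item-per-line format described above can be produced automatically. The following is a minimal Python sketch and is not part of Query Google; the names `format_item` and `build_input_text` are hypothetical, and the sketch assumes that any item containing a space is a phrase that needs quotation marks.

```python
def format_item(item):
    """Return one input-file line: a bare word, or a phrase in double quotes."""
    text = " ".join(item.split())  # trim and collapse stray whitespace
    if not text:
        raise ValueError("empty item")
    already_quoted = len(text) >= 2 and text.startswith('"') and text.endswith('"')
    if " " in text and not already_quoted:
        return '"' + text + '"'
    return text


def build_input_text(items):
    """Join the formatted items, one per line, ending with a newline."""
    return "\n".join(format_item(item) for item in items) + "\n"


# Example: two single words and one phrase.
print(build_input_text(["Saussure", "Jakobson", "Noam Chomsky"]), end="")
```

The resulting text can then be saved as a plain text file, for instance with `pathlib.Path.write_text`.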
Here is an example of a legal input file:
Saussure
Jakobson
Bloomfield
Sapir
"Noam Chomsky"
Running Query Google
To run Query Google, click on this link or on the Run Query Google link above. Wait for a bit, while the Java programming language gets itself ready. After a few seconds, you will see the program window, which looks like this:
Click on the Choose button. A new window, Choose Input File, will pop up, which will permit you to navigate through the folders on your computer and locate your input file. Once you've found it, click on the file name to choose it. Then click Choose, and you will be returned to the main Query Google window.
Now, click Do Word Count. Query Google will ask Google to find the frequencies of the words (or phrases) in your file. Currently, on a fast Internet connection, it can cover about 25 words or phrases per second. A "Working..." window will pop up and show you how far Query Google has gotten. When it tells you "Done", click on Close.
The results of Query Google are stored in a text file, in the same folder as your original list of forms. For example, if your input file is called YourFileName.txt, then the output file will be called "Results for YourFileName.txt".
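If a script needs to locate the results afterwards, the naming rule above can be written down as a small helper. This is a sketch in Python, not code from Query Google; `results_path` is a hypothetical name, and it assumes the output name is formed exactly as in the example, that is, the prefix "Results for " followed by the input file's name, in the same folder.

```python
from pathlib import Path


def results_path(input_path):
    """Return the expected location of the results file for a given input file."""
    p = Path(input_path)
    # Same folder as the input file, with "Results for " put in front of its name.
    return p.with_name("Results for " + p.name)


print(results_path("YourFileName.txt").name)
```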
Here is an example. On reading the input file shown above, Query Google (run in July 2003) returned:
Saussure 103000
Jakobson 57600
Bloomfield 803000
Sapir 61800
"Noam Chomsky" 182000
Since these counts are rather large, Google rounded them off (judging by the figures above, to about three significant digits). For lower numbers, you get the exact value; for example, querying "Leonard Bloomfield" returned exactly 1790.
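To work with the counts further, you may want to read the results into a table. Here is a minimal Python sketch, assuming each line of the results file has the layout shown above: the form, then whitespace, then the count. The function name `parse_results` is hypothetical, and the real file may differ in detail (for instance in its separator), so check a sample before relying on it.

```python
def parse_results(text):
    """Map each queried form to its count, given the text of a results file."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split once from the right, so phrases containing spaces stay intact.
        form, count = line.rsplit(None, 1)
        counts[form.strip('"')] = int(count)
    return counts


sample = 'Saussure 103000\nJakobson 57600\n"Noam Chomsky" 182000\n'
for form, count in parse_results(sample).items():
    print(form, count)
```

Remember that the larger counts are rounded, so small differences between them should not be over-interpreted.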
Choosing the target language
Just like Google itself, Query Google permits you to restrict your search to a particular language. To do so, check the box for the language you want on the right side of the Query Google interface.
Google will probably be adding more languages in the future. You can find the current set of languages, along with the special Google codes, at this Web page. Look up both the name and code of your target language and type them in the Add Custom Language text box of the Query Google interface.
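To give a rough idea of what a language code does, here is a Python sketch of how a language restriction can appear in a Google search address. This is an illustration only: the text does not say how Query Google builds its requests, and the use of the `lr` parameter with codes of the form `lang_fr` is an assumption based on Google's publicly documented search parameters. The function name `search_url` is hypothetical, and the sketch only builds the address; it does not contact Google.

```python
from urllib.parse import urlencode


def search_url(query, language_code=None):
    """Build a Google search address, optionally restricted to one language."""
    params = {"q": query}
    if language_code:
        # Assumed form of a Google language code, e.g. "lang_fr" for French.
        params["lr"] = language_code
    return "https://www.google.com/search?" + urlencode(params)


print(search_url("Sapir", "lang_fr"))
```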
Details
Query Google provides two methods for automatic Google queries. The recommended method is the one labeled "HTTP" on the Query Google interface (upper left corner). This method is selected for you automatically if you don't make a choice.
If for some reason the "HTTP" method doesn't work, try the alternative "API" method. To do this, you must first obtain a (free) authorization code ("license key"; click on link to get one from Google). With this method, you'll have to enter the license key in the window near the upper left corner of the interface. Query Google will remember this key the next time you use it, so long as you answer "yes" to the question "Save settings?" when you exit the program.
The "API" method currently limits you to 1000 queries per day.
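If your list is longer than the daily limit, one way to cope is to split it into several input files and run one per day. The following Python sketch shows the splitting step; `daily_batches` is a hypothetical name, and the sketch assumes that each word or phrase costs exactly one query, which the text does not state.

```python
DAILY_LIMIT = 1000  # queries per day for the "API" method, as stated above


def daily_batches(items, limit=DAILY_LIMIT):
    """Split a list of forms into consecutive chunks of at most `limit` items."""
    return [items[i:i + limit] for i in range(0, len(items), limit)]


forms = ["form%d" % n for n in range(2500)]  # placeholder items
for day, batch in enumerate(daily_batches(forms), start=1):
    print("day", day, "-", len(batch), "items")
```

Each batch can then be saved as its own plain text input file.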
Questions and comments
Please direct questions and comments to Bruce Hayes at:
I'm not sure how many searches you can perform with this utility. In April 2007 I tried a rather lengthy search and was (very politely) blocked by Google. A similar search today, however, worked just fine.
Several academics I know who have contacts at Google have suggested to me that, if you do get blocked, it would be worth your while to make a courteous inquiry to Google, explaining that your purpose is research, and that such an inquiry would likely be received sympathetically. So if you find that the size of the search you want to conduct exceeds what Google currently permits, I suggest you pursue this route. I would appreciate hearing what you learn; see email below.
--Bruce Hayes, Department of Linguistics, UCLA, February 1, 2007.