On 0, Michael O'Donnell <mod+blu at std.com> wrote:
>
> >> I have a directory of 10000+ text files and would like to search
> >> for some strings in these files. When I tried using the "grep"
> >> command with an asterisk, I got an error message, something to the
> >> effect of:
> >>
> >>     "File argument list too long"
> >>
> >> What is the file argument limit for grep? I guess you need the grep
> >> source for this. I did not find any information in the man page.
> >>
> >> Are there any other recommended tools to search through such a large
> >> list of files?
> >
> > That has nothing to do with grep. It is a limit of
> > the shell. One way around this is to use the find command.
> >
> > Remember that find recursively follows directories, so
> > you may want to tell find not to recurse.
> >
> > Simple example:
> >
> >     tarnhelm.blu.org [11] find . -type f -exec grep "Subba Rao" {} \; -print
> >
> > or
> >
> >     tarnhelm.blu.org [12] find . -type f -exec grep -l "Subba Rao" {} \;
> >
> > The example will search all regular files in the current
> > directory and subdirectories. Grep will print the matching text,
> > but not the file name; if the text is found, the file
> > name is printed on the following line. The second example
> > uses the -l option of grep, which prints only the file name.
>
> The ultimate source of the limitation in question
> is the amount of space reserved for argv[] (and
> don't forget envp[]) in the kernel's exec module -
> it's a hardcoded value that is typically VERY large -
> 32 pages (128KB) in the 2.2.17 kernels, for example.
>
> (Hmmm, now that you've got me looking at it I might
> have found a bug - it appears that the size of all
> args and the size of all environment variables are
> being individually compared to that limit value,
> rather than in aggregate...)
>
> Anyway, the workaround suggested above is correct,
> but allow me to suggest a variation that is slightly
> more efficient:
>
>     find . -type f -print | xargs -l100 grep -H "Subba Rao"
>
> This approach uses find only to generate the list
> of files, which it simply shoves into the pipeline.
> Meanwhile xargs is told to batch up 100 filenames at
> a time from that pipeline and pass them all to grep on
> the command line; grep has also been told (via -H) to
> mention each file's name when it gets a hit. This
> drastically reduces the number of times grep needs
> to be exec'd, so things should go a little faster.
> You can experiment with that -l100 parameter, too -
> you could conceivably keep bumping it up until you
> once again run into the argv[] limit that started
> this whole discussion in the first place...

Thanks for replying. I tried the following solution and it worked, and it is
much faster than using the plain 'find' command:

    find <path> -print | xargs -n 500 grep <pattern>

Thanks to everyone who replied with a solution!

--
Subba Rao
subb3 at attglobal.net
http://pws.prserv.net/truemax/
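A note on the commands above: on most Linux systems the actual limit can be
checked with "getconf ARG_MAX", which reports the available argument space in
bytes. A further refinement of the xargs pipeline, assuming GNU find and xargs
(the -print0 and -0 options are GNU extensions), is to pass NUL-terminated
names so that file names containing spaces or newlines are handled safely:

    find . -type f -print0 | xargs -0 grep -H "Subba Rao"

Here find terminates each name with a NUL byte and xargs splits on NUL instead
of whitespace, so quoting problems disappear; without an explicit -n or -l
value, xargs simply packs as many names into each grep invocation as the
argv[] limit allows.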