simple regex question
John Westcott IV
John.Westcott at tufts.edu
Thu Jan 11 16:14:07 EST 2007
To build on the command-line approach: after cut -d':' -f3 you should be
able to add "| cut -c1-20" to keep only the first 20 chars of field 3.
echo "d5:filesd20:xxxxxxxxxxxxxxxxxxxxd8:completei2e10:downloadedi0e10:incompletei4e" \
  | cut -d':' -f3 | cut -c1-20
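
If you need this more than once, the same pipeline can be wrapped in a small
shell function. This is only a sketch: the name extract_hash is made up for
illustration, and it assumes the hash is always 20 characters long and never
contains a ':' itself.

```shell
# Hypothetical helper: print the first 20 characters of the third
# ':'-separated field of the string passed as $1.
extract_hash() {
    printf '%s\n' "$1" | cut -d':' -f3 | cut -c1-20
}

line='d5:filesd20:xxxxxxxxxxxxxxxxxxxxd8:completei2e10:downloadedi0e10:incompletei4e'
extract_hash "$line"
```

Field 3 on its own is "xxxxxxxxxxxxxxxxxxxxd8", which is why the second cut
is needed to trim the trailing "d8".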
-John
Danny wrote:
> Quoting Dwight E Chadbourne <dwighte.chadbourne at stopandshop.com>:
>
>> Hi all. I want the 20 digit hash in this text.
>>
>> d5:filesd20:xxxxxxxxxxxxxxxxxxxxd8:completei2e10:downloadedi0e10:incompletei4e
>>
>> 4:name12:xxxxxxxxxxxxee5:flagsd20:min_request_intervali3600eee
>>
>> How do I get only the xxxxxxxxxxxxxxxxxxxx and not the preceding
>> identifier?
>
> I can't give a definitive answer without knowing if there's either
> always 20 "x"s, or if you just want the full text in between the
> second and third ':'.
>
> So, using regex:
>
> 1) assuming 20 characters immediately following the second ':'
> ^([^:]*:){2}(.{20}).*$
> This will capture your desired value in group #2, so if you
> were using perl (assuming your original content was in '$string')
> $string =~ s/^([^:]*:){2}(.{20}).*$/$2/;
>
> 2) The full text between the second and third ':'
> ^([^:]*:){2}([^:]*):.*$
> Again, this will put everything between the second and third ':' into
> backreference #2, to be used in the same fashion as the previous example.
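
For anyone who would rather stay in the shell than use perl, both patterns
can be tried with sed. This is a rough sketch, assuming a sed that accepts
-E for extended regular expressions (GNU and BSD sed both do); the variable
names are made up for illustration.

```shell
line='d5:filesd20:xxxxxxxxxxxxxxxxxxxxd8:completei2e10:downloadedi0e10:incompletei4e'

# Pattern 1: exactly 20 characters after the second ':'.
fixed=$(printf '%s\n' "$line" | sed -E 's/^([^:]*:){2}(.{20}).*$/\2/')

# Pattern 2: everything between the second and third ':'.
between=$(printf '%s\n' "$line" | sed -E 's/^([^:]*:){2}([^:]*):.*$/\2/')

printf '%s\n' "$fixed"     # the 20 x's only
printf '%s\n' "$between"   # the 20 x's followed by "d8"
```

On the sample line the second pattern also picks up the "d8" that belongs to
the next key, so the fixed-width version is the one that matches what Dwight
asked for.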
>
> One of the other responders mentioned using 'awk' via the command-line
> to isolate the content between the second and third ':'. You could
> use 'cut' to accomplish the same thing.
>
> echo "d5:filesd20:xxxxxxxxxxxxxxxxxxxxd8:completei2e10:downloadedi0e10:incompletei4e" \
>   | cut -d':' -f3
>
> This specifies that the field delimiter is ':' and that you want the
> third field isolated.
>
> I hope this was helpful,
> -Danny Robert
> daniel.robert at acm.org
>
> P.S.: This is my first post to this user list since moving to Boston
> about a year ago. Just thought I'd say "hi".
>
> --This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
>
> _______________________________________________
> Discuss mailing list
> Discuss at blu.org
> http://lists.blu.org/mailman/listinfo/discuss
>