On Mon, 2004-04-12 at 18:29, Paul Winkler wrote:
> FYI, I'm still planning to implement my own proposal which has
> been discussed quite a lot in the L-A-U archives.
> I do somewhat similar sites for a living. It just needs me to block out a
> chunk of time (1 or 2 weekends) to bang it out.
Hi Paul,
I'd like to assist, so I've written a little script that automatically
extracts all the links from linux-sound.org.
It depends on Ruby, wget, lynx and sed.
The output is tab-separated values (TSV) with three fields per row:
text, URLs and category.
Some of the <li>s contain more than one URL. In that case the URLs are
separated by blanks (' ').
The category is either the title (<h3>) of the subpage or the text of
the list item that contains the current <ul>.
The script expects at least one URL from linux-sound.org (one of the
subpages) as an argument, so you'll also need a working internet
connection.
The '-H' option additionally prints a CSV header line.
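
In case it helps with importing the data, here is a rough Ruby sketch of
how the output described above could be read back in. It is not part of
the attached script: the names LinkEntry and parse_links are made up for
this example, and the sample row is invented data, not real output.

```ruby
# Hypothetical record type; the three fields follow the description above.
LinkEntry = Struct.new(:text, :urls, :category)

# Parse tab-separated output into LinkEntry records.
# Pass header: true if the data was produced with '-H', so that the
# first line is skipped instead of being treated as a row.
def parse_links(tsv, header: false)
  lines = tsv.lines.map(&:chomp).reject(&:empty?)
  lines = lines.drop(1) if header
  lines.map do |line|
    text, urls, category = line.split("\t", 3)
    # Several URLs in one field are separated by blanks.
    LinkEntry.new(text, urls.to_s.split(" "), category)
  end
end

# Invented sample row, for illustration only.
sample = "Example App\thttp://example.org/a http://example.org/b\tExample Category\n"
parse_links(sample).each do |entry|
  puts "#{entry.category}: #{entry.text} (#{entry.urls.size} URLs)"
end
```

Splitting the URL field on blanks should be safe as long as the URLs
themselves contain no unescaped spaces.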
I also attached a bash script to extract all the subpages from
linux-sound.org.
If you have any problems with this script I can send you all the data
off list.
HTH,
Jan
P.S. Follow-up to LAU?