Well, it turns out that I have a rather huge collection of CDs. Some bought, some inherited and so on - all legit, no piracy. I ripped them all to full-quality .flac files over the course of a couple weeks (we're talking huge here), and while those play fine on the computer and are "only" about 150 GB total, I wanted pretty-good MP3s of it all for use in automobiles and so on. There is a nice script out there you can put in .gnome2/nautilus-scripts that converts things, but this job was well past what it could handle, so to speak. It takes too much human intervention, can't recurse through a whole slew of subdirectories, and can't handle mixed input formats (some things I only had as MP3, like some very old music I'd done myself). Soooo...we find that sometimes it's quicker to write code to automate something than to do it manually, even if you're only going to do that "something" one time - in this case, converting something like 7k files of mixed input types to decent-quality MP3s (about 192 kbps average).
Here's what I began with:
Soooo, bash ain't my cup of tea, really; Perl is more to my liking. So I wrote a Perl script to do the subset of this that I want/need, and optimized it in a couple of ways.
Rather than fool with threading or forking, I simply divided my main CDRips directory in two and hardcoded a different source dir into each of two copies of my Perl script - allowing two to run at once on a two-core machine I have here that gives the best MIPS/watt. It also has an SSD I'm willing to beat up for this one job - I use it for the temp .wav files decoded from .flac, so lame can do the .wav-to-mp3 step on them.
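The per-file pipeline is simple enough to sketch in shell (a hedged sketch, not the actual Perl script - the example path, the temp file name, and the exact flag set are my assumptions, though `flac -d` and `lame -V2` are standard options):

```shell
# One file's trip through the pipeline: mirror the source tree under MP3s/,
# decode to a temp .wav on the SSD, then encode with lame's VBR preset.
src="CDRips/Artist/Album/01 Track.flac"   # hypothetical input file
tmp="/ssd/tmpA.wav"                       # temp .wav parked on the SSD
dst="MP3s/${src#CDRips/}"                 # same relative path, rooted at MP3s/
dst="${dst%.flac}.mp3"                    # swap the extension

# Dry-run: print the two commands rather than executing them here.
echo "flac -d -f -s -o \"$tmp\" \"$src\""   # decode FLAC -> WAV, silently
echo "lame --quiet -V2 \"$tmp\" \"$dst\""   # VBR encode, roughly 190 kbps average
```

In the real script those two commands run for every file found in the tree, with the temp .wav overwritten between conversions.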
I worked it out based on some initial tests. A single-threaded version of this was going to take well over 40 hours - even running 40-50x faster than real time(!). Being on solar power with uncertain spring weather, and not wanting to build in the ability to stop and restart the program, I decided to go for raw speed. I divided my source CDRips directory into CDRipsA and CDRipsN (split alphabetically, they are about equal in size) and made two versions of the Perl script, each using one of the two hardcoded source dirs (yuck, I know, but adding a GUI etc. would have been more work and I didn't need it - this is a one-shot), to convert it all into another directory called "MP3s".
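Run side by side, the two script copies keep both cores busy. The idea can be sketched like this (a sketch only - `convert_half` is a hypothetical stand-in for one hardcoded script copy, not the author's actual code):

```shell
# Stand-in for one hardcoded copy of the converter script; the real one
# walks its half of the tree and runs the flac/lame pipeline on each file.
convert_half() { echo "converting $1"; }

# Launch both halves in the background and wait for both to finish,
# so each of the two cores gets one full-time conversion process.
out=$( convert_half CDRipsA & convert_half CDRipsN & wait )
echo "$out"
```

With two independent processes there's no shared state to coordinate - the only thing they must not share is the temp .wav file name, hence the per-copy temp names.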
And here they are - no differences other than the hardcoded source dir name and temp file name in each.
They are screaming along right now (6:30 on 4/20/2013) and ought to be finished by tomorrow afternoon. Did I say huge? This is pegging both CPUs in a Core Duo running at 3.12 GHz...using nearly no memory, an SSD for the two temp files, and a spinner for input and output. Zowie...I used most of the tricks to make this go fast; even if a trick cut only a second per conversion, I used it. I'm using variable bit rate here, averaging 192 kbps or so, with the latest/greatest lame settings and all that. I could have made it maybe 0.5 sec per conversion quicker by eliminating all the status messages, but at that point I wanted to be able to see it working, so I just throttled those to one per second.
And here is the result. Turns out I could have done it a little better, as commented in there - lame now supports tagging better, so I didn't have to copy tags the old way. But frankly, when the conversion itself takes a bunch of seconds and the re-tagging (copy the tags from the original .flac into variables, then into the resulting mp3 file) takes an imperceptible time, it wasn't worth experimenting any further.
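That "copy the tags" step can be sketched in shell (hedged: I'm assuming metaflac's `NAME=value` output format and lame's standard `--ta`-style tag flags; the sample tag line below is fabricated for illustration, and the actual script's mechanics may differ):

```shell
# `metaflac --show-tag=ARTIST file.flac` prints a line like "ARTIST=value".
# Since we can't run metaflac here, simulate one such output line:
tagline="ARTIST=Some Band"          # stand-in for real metaflac output
artist="${tagline#ARTIST=}"         # strip the "ARTIST=" prefix, keep the value

# The value then goes back in during the encode via lame's tag flags
# (dry-run: print the command instead of running it):
echo "lame --quiet -V2 --ta \"$artist\" in.wav out.mp3"
```

The same pattern repeats for title, album, year, and track number - read each tag out of the .flac into a variable, then hand it to lame on the command line.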
So, sometime tomorrow afternoon, I'll have ~150 GB compressed roughly 8:1 to burn onto USB sticks for my cars, or onto CDs for my truck. That's a pretty nice compression ratio and will make my life a lot simpler for on-road entertainment. Heck, that's only 18.5 GB estimated - if I ditch a few of the less worthwhile titles, I should be able to fit it all on one stick, or about 4 DVDs...that's real progress and will unclutter my vehicles a good deal.
Motto - do something cool every day. Then at some point you find you're surrounded by uber-coolness. Try it.
No pix, no attention span. So here's a screenshot of this maxing out my main server (not my fastest machine, just the most efficient one, so it can be on more often - on solar power).