My team already makes fun of me about my little shell-script-crusade.
7 command-line tools for data science (7cltfds)
When I came across "7 command-line tools for data science", I liked the article. Installing all the different tools was a bit of a pain, as it required Python and Node.js to be set up. I hesitated to install the tools and start playing because I didn’t have a cool idea or problem to solve.
Leaving the comfort zone
One day, while working on lead qualification for our sales team in Brazil, I looked at a website that offered a lot of content about online shops.
Then I got lost in refactoring throw-away code because it was so ugly. I wasn’t happy about maintaining yet another project for future lead qualification from other sources; it smelled.
So I thought about how to shell-script it. Now I had a reason to come back and install the tools mentioned in 7cltfds.
Instead of using the pagination of the site, I used its search as an API to search for all shops ^^.
Although the shell script looks like a mess, it was straightforward to create: I built it line by line, keeping temporary results in text files (a very nice feedback loop).
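To make the feedback loop concrete, here is a minimal sketch of the "one stage, one file" approach. It is not the original script: the file names, the semicolon-separated sample data and the URL in the comment are all hypothetical, and a here-document stands in for the response of the site's search.

```shell
#!/bin/sh
# Sketch only: every file name, URL and data row below is hypothetical.
set -eu

workdir=$(mktemp -d)

# Stage 1: fetch the raw data once. In a real run this could be something like
#   curl -s "https://shop-directory.example/search?q=" > "$workdir/01_raw.txt"
# (made-up URL). A small sample stands in for the response here.
cat > "$workdir/01_raw.txt" <<'EOF'
Loja A;https://loja-a.example;fashion
Loja B;https://loja-b.example;electronics
Loja A;https://loja-a.example;fashion
Loja C;https://loja-c.example;fashion
EOF

# Stage 2: drop duplicate rows; look at 02_unique.txt before moving on.
sort -u "$workdir/01_raw.txt" > "$workdir/02_unique.txt"

# Stage 3: keep only the rows of one category.
grep ';fashion$' "$workdir/02_unique.txt" > "$workdir/03_fashion.txt"

# Stage 4: extract the URL column.
cut -d ';' -f 2 "$workdir/03_fashion.txt" > "$workdir/04_urls.txt"

cat "$workdir/04_urls.txt"
```

Because every stage leaves a file behind, a broken step can be rerun on its own without fetching anything again, and each intermediate file can be inspected with `less` or `head`.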
Knowledge transfer of shell scripts
The best side effect is that I’ve been able to show the shell script to our business intelligence team. Besides scraping future lead-qualification sites, BI learned a great deal about solving recurring tasks, such as generating reports from other APIs, by learning more shell commands.
Learn more shell commands
So what are the next steps to shell-script mastery?
Get familiar with the main operators
|     # the pipe
<, >  # redirecting in and out
()    # firing off subshells
while if for
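A small sketch that uses each of the operators and keywords above once. The file names and words are made up for illustration.

```shell
#!/bin/sh
# Sketch: pipe, redirection, subshell, and for/while/if in one place.
set -eu

tmp=$(mktemp -d)

# > redirects the output of a command into a file
printf 'alpha\nbeta\ngamma\n' > "$tmp/words.txt"

# < feeds a file into a command; | pipes one command into the next
upper=$(tr 'a-z' 'A-Z' < "$tmp/words.txt" | head -n 1)
echo "$upper"

# ( ) runs in a subshell: the cd does not leak into the parent shell
before=$PWD
(cd "$tmp" && ls > listing.txt)
after=$PWD

# for: iterate over a fixed list of words
count=0
for w in alpha beta gamma; do
  count=$((count + 1))
done

# while + if: read the file line by line and react to one value
found=no
while read -r line; do
  if [ "$line" = "beta" ]; then
    found=yes
  fi
done < "$tmp/words.txt"

echo "$count $found"
```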
cat    # stream content
find   # find files
cut    # split functionality
grep   # search
xargs  # smash multiple lines into one and map over the elements
wc     # word count (also lines and bytes)
sort
sed    # stream editor, think gsub
awk    # field-based text processing, crazy shit
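To get a feel for how these commands combine, here is a minimal sketch on a made-up log file. The file name and its contents are assumptions for illustration, and `uniq`, `head` and `basename` are used in addition to the commands listed above.

```shell
#!/bin/sh
# Sketch: the core commands chained together on hypothetical sample data.
set -eu

tmp=$(mktemp -d)
cat > "$tmp/access.log" <<'EOF'
GET /shop/1 200
GET /shop/2 404
POST /cart 200
GET /shop/1 200
EOF

# grep filters lines, cut picks a column, sort + uniq -c count occurrences,
# sort -rn puts the highest count first, awk prints the second field
top=$(grep '^GET' "$tmp/access.log" | cut -d ' ' -f 2 | sort | uniq -c \
  | sort -rn | head -n 1 | awk '{ print $2 }')

# wc -l counts lines
total=$(wc -l < "$tmp/access.log" | tr -d ' ')

# sed rewrites text with a substitution
rewritten=$(sed 's/shop/store/' "$tmp/access.log" | head -n 1)

# awk works on fields: count the rows whose third field is not 200
errors=$(awk '$3 != 200 { n++ } END { print n + 0 }' "$tmp/access.log")

# find locates files; xargs hands each result to another command
found=$(find "$tmp" -name '*.log' | xargs -n 1 basename)

echo "$top $total $errors $found"
echo "$rewritten"
```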
Additional key tools
curl
xml2json  # nice to access attributes out of HTML documents
jq        # key utility to manipulate/view JSON
scrape    # command line scraper
json2csv  # to feed graphing libraries
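As a rough illustration of the JSON end of that toolchain, the sketch below turns a JSON array into CSV rows. Two things are assumptions: the sample data is invented and stands in for what a `curl -s` call might return, and `jq`'s built-in `@csv` filter is used here in place of `json2csv`, so only `jq` needs to be installed.

```shell
#!/bin/sh
# Sketch: JSON to CSV with jq. The sample data is hypothetical.
set -eu

json='[{"name":"Loja A","url":"https://loja-a.example"},{"name":"Loja B","url":"https://loja-b.example"}]'

csv=""
if command -v jq >/dev/null 2>&1; then
  # .[] walks the array, [.name, .url] builds one row per element,
  # @csv quotes and joins the row, -r prints it as raw text
  csv=$(printf '%s' "$json" | jq -r '.[] | [.name, .url] | @csv')
else
  echo "jq is not installed; skipping" >&2
fi

printf '%s\n' "$csv"
```

The resulting rows can be redirected into a file and fed to whatever graphing or spreadsheet tool comes next.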
What to learn next
ls /usr/bin | less # yields a few commands :)