TIL: Firefox Bookmarks Export - HTML to PLAIN File

Oct 31, 2020

Today I Learned

A long time ago, I started using browser bookmarks, and carrying them around from system to system has been a pain. I hope there are people like me who want to carry their legacy bookmarks forward.

I’m not going to give the steps for exporting bookmarks from Firefox, as they are already straightforward. Still, if somebody needs help, please look here.

SED to the rescue

sed, the stream editor, is a popular choice among sysadmins and hardcore developers. Once you’ve exported, you’ll have an HTML file in the destination of your choice.

In your terminal, use the following command to extract only the href part from all those nasty large links:

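sed -r -e '/.* HREF=.*$/!d' -e 's/.* HREF="(.*)\" ADD_DATE.*$/\1/g' {INPUT FILE NAME}.html | sort -u > {OUTPUT FILE NAME}.txt

Let me break down the command: -r enables extended regular expressions, so the capture group can be written as ( ) without backslashes. The first -e expression deletes every line that does not contain an HREF attribute. The second replaces each remaining line with just the URL captured between HREF=" and " ADD_DATE. Finally, sort -u removes duplicate links (plain uniq only collapses duplicates that sit on adjacent lines, which bookmarks rarely do).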
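To see what the command does before pointing it at a real export, here is a small sketch that runs the same sed expressions on a made-up sample. The sample content, the temp file and the extract_links helper are my own assumptions for illustration; it de-duplicates with sort -u, since uniq on its own only drops adjacent repeats.

```shell
# Hypothetical helper wrapping the sed expressions from this post.
extract_links() {
  # Drop lines without HREF, reduce the rest to the URL, then de-duplicate.
  sed -r -e '/.* HREF=.*$/!d' -e 's/.* HREF="(.*)\" ADD_DATE.*$/\1/g' "$1" | sort -u
}

# Made-up sample shaped like a Firefox bookmarks export.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
<!DOCTYPE NETSCAPE-Bookmark-file-1>
<DL><p>
    <DT><H3 ADD_DATE="1600000000">Folder</H3>
    <DT><A HREF="https://example.org/b" ADD_DATE="1600000001" LAST_MODIFIED="1600000002">B</A>
    <DT><A HREF="https://example.com/a" ADD_DATE="1600000003">A</A>
    <DT><A HREF="https://example.org/b" ADD_DATE="1600000004">B again</A>
</DL><p>
EOF

links="$(extract_links "$sample")"
printf '%s\n' "$links"
# prints:
# https://example.com/a
# https://example.org/b
rm -f "$sample"
```

The folder line has no HREF, so it is deleted, and the repeated example.org link shows up only once.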

Now, the output file can be managed with any VCS out there.
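As one possible way to do that, here is a minimal sketch using git. The snapshot_bookmarks function, the repository path and the bookmarks.txt file name are all hypothetical; any other VCS would work along the same lines.

```shell
# Sketch: snapshot the extracted link list in a git repository.
# snapshot_bookmarks, the repo path and bookmarks.txt are hypothetical names.
snapshot_bookmarks() {
  links_file="$1"
  repo_dir="$2"
  mkdir -p "$repo_dir" || return 1
  # Create the repository on first use only.
  [ -d "$repo_dir/.git" ] || git init -q "$repo_dir" || return 1
  cp "$links_file" "$repo_dir/bookmarks.txt" || return 1
  git -C "$repo_dir" add bookmarks.txt || return 1
  # Commit only when the list changed since the last snapshot.
  if git -C "$repo_dir" diff --cached --quiet; then
    echo "no changes to commit"
  else
    git -C "$repo_dir" commit -q -m "Update bookmarks $(date +%F)"
  fi
}

# Usage (paths are placeholders):
#   snapshot_bookmarks ~/exports/bookmarks.txt ~/bookmarks-repo
```

Running it after every export gives a dated history of the link list, and git diff then shows which links were added or removed.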

Improvements (December, 2020):

parallel --pipepart -a {INPUT FILE NAME}.html -j4 --roundrobin \
sed\ -r\ -e\ \'/.\*\ HREF\=\"\(.\*\)\\\"\ ADD_DATE.\*\$/\!d\'\ -e\ \'s/.\*\ HREF\=\"\(.\*\)\\\"\ ADD_DATE.\*\$/\\1/g\' | \
sort -u > {OUTPUT FILE NAME}.txt

Over time, my bookmarks.html grew to around 50 MB. The browser took quite a while to export it, and sed took about 13 to 15 seconds to process the whole file. The improvement requires an extra package, GNU parallel (sudo apt install parallel, or download it directly from the site).

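Command breakdown: the backslash-escaped sed string above comes from parallel --shellquote, which turns a pasted command into a single shell-quoted argument: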

$ parallel --shellquote
parallel: Warning: Input is read from the terminal. You either know what you
parallel: Warning: are doing (in which case: YOU ARE AWESOME!) or you forgot
parallel: Warning: ::: or :::: or to pipe data into parallel. If so
parallel: Warning: consider going through the tutorial: man parallel_tutorial
parallel: Warning: Press CTRL-D to exit.
{PASTE the SED command from the above (before improvements)}
{SHELL QUOTED OUTPUT STRING}

After switching to parallel, the time dropped to 6 to 7 seconds, roughly half of the original run.
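Since the parallel run splits the file into chunks and the order of the output may differ between runs, it can be worth checking that both pipelines extract the same set of links. Below is a rough sketch of such a check. The compare_pipelines function is hypothetical, and to sidestep the shell quoting it puts the two sed expressions into a temporary script file and uses sed -f, which is a swap on my part and not the quoted form shown above.

```shell
# Sketch: check that the serial and the parallel pipeline agree.
# compare_pipelines is a hypothetical helper, not part of sed or parallel.
compare_pipelines() {
  input="$1"
  script="$(mktemp)" || return 1
  # Same two expressions as in the post, stored as a sed script file.
  cat > "$script" <<'EOF'
/.* HREF=.*$/!d
s/.* HREF="(.*)\" ADD_DATE.*$/\1/g
EOF
  serial="$(sed -r -f "$script" "$input" | sort -u)"
  if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # --pipepart splits the file on line boundaries and feeds chunks to sed.
    chunked="$(parallel --pipepart -a "$input" -j4 --roundrobin sed -r -f "$script" | sort -u)"
  else
    echo "GNU parallel not found; skipping the comparison" >&2
    chunked="$serial"
  fi
  rm -f "$script"
  # Succeeds only when both runs produced the same sorted link list.
  [ "$serial" = "$chunked" ]
}

# Usage (file name is a placeholder):
#   compare_pipelines bookmarks.html && echo "outputs match"
```

Sorting both outputs before comparing is what makes the check independent of the order in which the jobs finish.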

