Experiments with web scraping

A few weeks ago I came across web scraping. I found the subject intriguing, with numerous real-life applications.

After a bit of research, I found plenty of applications and frameworks for quickly building an app that scrapes data off websites.

The ones I tried and experimented with a little are: Continue reading “Experiments with web scraping”

Quick website backup

To quickly back up your public folder, use the command below:

tar -zcvf /home/protected/public-$(date +%Y%m%d).tar.gz /home/public/

This compresses the files in /home/public and places the archive, named with today's date, in the /home/protected folder.
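
If you run this backup regularly, it may help to wrap it in a small function that also checks the archive can be read back. The following is only a sketch under assumptions: the function name backup_dir and its three arguments are hypothetical and not part of the original command, and it stores paths relative to the parent directory rather than absolute ones.

```shell
#!/bin/sh
# backup_dir SRC DEST_DIR PREFIX
# Hypothetical helper (not from the original post): archives SRC into
# DEST_DIR/PREFIX-YYYYMMDD.tar.gz and prints the archive path on success.
backup_dir() {
    src=$1
    dest_dir=$2
    prefix=$3

    # $(date +%Y%m%d) expands to today's date as eight digits (year, month, day)
    archive="$dest_dir/$prefix-$(date +%Y%m%d).tar.gz"

    # -C switches to the parent directory so the archive holds relative paths
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1

    # Listing the archive confirms that it is readable
    tar -tzf "$archive" > /dev/null || return 1

    printf '%s\n' "$archive"
}
```

With the folders from the command above, the call would be: backup_dir /home/public /home/protected public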

If you’re using Amazon EC2, the command below will help:

tar -zcvf /var/www/-$(date +%Y%m%d).tar.gz /var/www/html

This will compress the files in /var/www/html and place the archive in the /var/www/ folder.

How to enable gzip on Amazon EC2 Instance

I am still tweaking small things on the Amazon EC2 server that hosts my site. One thing I did not do immediately was enable gzip compression for the site data served to browsers. Gzip compresses the files before they are pushed across all those tubes that make up the internet, and the browser then decompresses them on the other side. Continue reading “How to enable gzip on Amazon EC2 Instance”
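
Whichever way compression gets enabled on the server, one way to check that it is active is to request a page while advertising gzip support and look for a Content-Encoding header in the response. This is a minimal sketch, assuming curl is installed; the function names has_gzip_header and check_gzip are hypothetical, and http://example.com/ below is a placeholder for your own site.

```shell
#!/bin/sh
# has_gzip_header: hypothetical helper. Reads HTTP response headers on stdin
# and succeeds if a Content-Encoding header mentions gzip.
has_gzip_header() {
    # Strip carriage returns, keep only Content-Encoding lines, look for gzip
    tr -d '\r' | grep -i '^content-encoding:' | grep -qi 'gzip'
}

# check_gzip URL: hypothetical helper. Fetches URL with an Accept-Encoding
# header, discards the body (-o /dev/null) and dumps the headers to stdout (-D -).
check_gzip() {
    if curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' "$1" | has_gzip_header; then
        echo "gzip enabled"
    else
        echo "gzip not detected"
    fi
}
```

For example: check_gzip http://example.com/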