Friday 26 April 2013

Is scraping legal?

Lots of people, when they hear about ScraperWiki, ask “Is scraping legal? How can you build a business off that?”. They usually follow up by saying “we do it in our company, but we would never tell anyone”.

This is strange to us, as we have come from a world of good scraping: taking government data and making it easier for people to use for things that benefit all of society. We’re in favour of that kind of scraping.

It’s obviously a spectrum. At the other extreme, the most evil scraping would be to steal content that somebody else sells, and then republish it in a way that harms their business. We’re against that kind of scraping.

It’s not scraping itself which is good or bad, or legal or illegal, but the circumstances in which you’re doing it.

We’ve written up our full policy on legality in our FAQ, under ‘What’s your policy on what’s legal to scrape?’. It goes into detail about robots.txt and take-down notices, and about what our legal responsibility is and what yours is.
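Since robots.txt comes up in that policy, here is a minimal sketch of one common way a scraper can check a site's robots.txt before fetching a page, using Python's standard `urllib.robotparser`. This is an illustration only, not ScraperWiki's own implementation or a statement of its policy. The robots.txt content, the `example-scraper` user agent and the example.com URLs are all assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs without
# network access. A real scraper would download it from the target site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

# Hypothetical user agent name for our scraper.
USER_AGENT = "example-scraper"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, url: str) -> bool:
    """Return True if the rules allow USER_AGENT to fetch url."""
    return parser.can_fetch(USER_AGENT, url)


robots = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    for url in (
        "https://example.com/data/report.html",
        "https://example.com/private/accounts.html",
    ):
        print(url, "allowed" if may_fetch(robots, url) else "disallowed")
    print("crawl delay (seconds):", robots.crawl_delay(USER_AGENT))
```

Passing a robots.txt check does not by itself make a scrape legal; as noted above, that depends on the circumstances.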

Finally, ScraperWiki isn’t just about scraping.

We’re a data hub, and you need to get data into a data hub. As well as scraping, lots of people make API calls to do that on ScraperWiki, or download their own files from their own servers.

This is much more profound than it sounds – when you are using data for a new purpose, even if it is already structured, you still need to get it and convert it to your new needs. How you do that is a detail that depends on the circumstances.

The difference between parsing HTML web pages and using a JSON REST API is surprisingly small. As an example, Thomas scraped EventBrite even though it has an API (see the post at the end of that thread by Ryan, who works at EventBrite!), because it was easier for him at the time.
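To make that point concrete, here is a rough sketch that extracts the same records from an HTML fragment and from a JSON response, using only Python's standard library. The markup, the JSON layout and the field names are invented for illustration; they are not EventBrite's real pages or API.

```python
import json
from html.parser import HTMLParser

# Hypothetical sample data: the same two events, once as an HTML
# fragment and once as a JSON API response body.
HTML_PAGE = """
<ul>
  <li class="event" data-date="2013-05-01">Data Hack Day</li>
  <li class="event" data-date="2013-06-12">Open Gov Meetup</li>
</ul>
"""

JSON_BODY = """
{"events": [
  {"name": "Data Hack Day", "date": "2013-05-01"},
  {"name": "Open Gov Meetup", "date": "2013-06-12"}
]}
"""


class EventListParser(HTMLParser):
    """Collect name/date pairs from <li class="event"> elements."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._current_date = None  # set only while inside an event <li>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "event":
            self._current_date = attrs.get("data-date")

    def handle_data(self, data):
        if self._current_date is not None and data.strip():
            self.events.append(
                {"name": data.strip(), "date": self._current_date}
            )

    def handle_endtag(self, tag):
        if tag == "li":
            self._current_date = None


def events_from_html(page: str) -> list:
    parser = EventListParser()
    parser.feed(page)
    parser.close()
    return parser.events


def events_from_json(body: str) -> list:
    return [
        {"name": e["name"], "date": e["date"]}
        for e in json.loads(body)["events"]
    ]


if __name__ == "__main__":
    print(events_from_html(HTML_PAGE))
    print(events_from_json(JSON_BODY))
```

Either way, the code ends up with the same list of dictionaries. The JSON route needs less code, but the downstream work of cleaning and reshaping the records is identical.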

What matters is getting the data, and converting it into a form where it can do something useful for the world. And doing that legally. Whether you’re using Nokogiri or Nestful.

Source: http://blog.scraperwiki.com/2012/04/02/is-scraping-legal/

Note:

Delta Ray is an experienced web scraping consultant and writes articles on Web Screen Scraping, Scraping A Website, Extract Data From Website, Website Screen Scraping, Scrape A Website, etc.

India should focus on data mining, analysis of tax admin: Parthasarathi Shome

NEW DELHI: India should strengthen its Large Taxpayer Units (LTUs) and focus on data mining and analysis of tax administration, expert Parthasarathi Shome said today.

"There is need to strengthen LTUs and focus on data mining and analysis of tax administration," said Shome, who is also a professor at the Delhi-based think tank Indian Council for Research on International Economic Relations (ICRIER).

LTUs are self-contained tax administration offices under the Department of Revenue, acting as a single-window clearance point for all matters relating to central excise, income tax/corporate tax and service tax.

Entities would be able to file their excise, direct tax and service tax returns at such LTUs and, for all practical purposes, would be assessed for all these taxes there.

Shome, who heads the panel on GAAR set up by the Prime Minister, was addressing delegates at a seminar organised jointly by FICCI and the International Chamber of Commerce (ICC) here.

Finance Minister P Chidambaram had earlier proposed setting up LTUs in Budget 2005-06, with a view to reducing tax compliance costs and delays for large taxpayers.

Currently, LTUs function in Bangalore, Chennai, Mumbai and Delhi, but the scheme has not been successful as not many taxpayers volunteer for fear of closer scrutiny.

The scheme entailed opening a single-window facilitation centre for all large entities paying excise duty, corporate tax/income tax and service tax.

It was supposed to cover units that paid excise duty or service tax of Rs 5 crore or more, or corporation tax of Rs 10 crore or more.

Recently, Chidambaram had asked officers to focus on low-taxpaying sectors to add Rs 30,000 crore to the revenue kitty.

The minister had said that although the applicable tax rate for corporations was 30 per cent, the effective tax rate was just 24 per cent, and even lower in some sectors.

He had further said that, instead of randomly going after taxpayers, most of whom duly paid taxes, the revenue department should focus on those below-average sectors to see whether the average could be raised for greater collection.

Source: http://articles.economictimes.indiatimes.com/2012-11-22/news/35300853_1_service-tax-ltus-excise
