A dark data startup you haven’t heard of is spilling secrets to Starbucks, McDonald’s

18 Aug, 2017: There’s a startup in India quietly crunching enormous amounts of dark data. Quantta Analytics is digesting unstructured data – text messages, documents, email, video and audio files, and still images. It also ingests material from the deep web – everything online that is not indexed by search engines, including the anonymous, inaccessible sites known as the “dark web” – and spills its secrets to about a hundred entities paying for it.

You aren’t likely to have heard of Quantta as even its website says too little. The site almost makes the company look like some kind of family-run business with siblings Ritesh Bawri, Malvika Bawri, and Vinay Bawri, who used to run a multi-million dollar cement company, at the helm.

The rest of Quantta’s 20-member team – including mathematicians, statisticians, and engineers from MIT, Harvard University, University of Michigan, University of Maryland, Indian Institute of Technology (IIT), and Indian Statistical Institute (ISI), according to the founders – stays behind the veil. They are using neural networks – computer systems modeled on the human brain and nervous system – to understand and interpret dark data in real time, as it is being spewed out, and combining it with behavioral psychology to decipher and predict human behavior.

For now, it is mostly large financial institutions like State Bank of India (SBI), Kotak, and Fullerton paying Quantta for its intelligence. There are some from other industries like retail, hospitality, healthcare, energy, and food as well – like McDonald’s and Starbucks. But in the future, Indian intelligence agencies could pay Quantta for aid in covert operations. It is, after all, modeled after Palantir, the secretive US data giant, according to founder Ritesh Bawri.

Palantir has been lending tech muscle to intelligence outfits like the CIA, Homeland Security, military clients, police departments, and large financial institutions in the US. The CIA was an early investor in the company, founded in 2004 by a group of investors and technologists including Peter Thiel. Unconfirmed reports have said Palantir’s technology was used in the intelligence operation that killed Osama bin Laden.

Bawri gives me an example of what Quantta can do: “Imagine a voice conversation taking place between me and my Pakistani handler. I am chatting with him using an unregistered device and he is giving me instructions on what to do, discussing terror activities. In a perfect world, you should be able to suck in that conversation, transcribe it into text on top of which you can do a keyword search and say what these guys were talking about.”

A remotely operated underwater vehicle is about to capture “Oliver,” an octopus floating just a few feet away. Photo credit: The Hidden Ocean, Arctic 2005 Exploration.

“The use-cases for their [Quantta’s] product are limitless,” says Rohan Malhotra, co-founder of Investopad, who introduced the startup to me.

Quantta recently raised an undisclosed amount of funding from unnamed investors – “they are all renowned entrepreneurs and investors from across India and Silicon Valley,” Bawri says. This was a seed round of external funding for Quantta, which the Bawri siblings had been bootstrapping since 2014.

A lottery jackpot

A Google search on any topic would give just 0.03 percent of the information that exists online. That’s because its search engine cannot access the deep web – a place hidden behind paywalls you can access only with passwords and special software. The deep web is at least 500 times larger than the surface web – the stuff most of us use every day.

Google throws up about 16 percent of the surface web, according to a study published in Nature. Popular Science bills it “like fishing in the top two feet of the ocean – you miss the virtual Mariana Trench below.”

This is a problem screaming for solutions, and technologists around the world have begun to work on it.

In its latest Tech Trends 2017 report, Deloitte compares dark data to “a lottery jackpot.” The digital universe – comprising the data we create and copy annually – is expected to reach 44 zettabytes by 2020 and will contain nearly as many digital bits as there are stars in the universe, according to the report, which argues that it is finally the time for companies to utilize dark analytics.

Until recently, the tools to do this were limited. But the Deloitte report cites several new search tools that can help mine dark data.

  • Deep Web Technologies’ search tools – used by federal scientific agencies in the US as well as many academic and corporate organizations – retrieve and analyze data that is not accessible to standard search engines.
  • Hidden Web Exposer, a prototype engine built at Stanford University, scrapes the deep web for information using a task-specific, human-assisted approach.
  • Infoplease, PubMed, and the University of California’s Infomine are other publicly accessible search engines for the deep web.

A few months ago, Apple bought Lattice Data, an analytics company that uses artificial intelligence to convert dark data into usable data, for US$200 million.

Bawri gives me an example of how dark data analytics could prove useful for a consumer company. Say Nike has 200 stores in different parts of India, and it has data about what is happening in each of them: how much customers bought, what they bought, how much time they spent at the store, what time they bought, and so on. “If Nike gives that data to me, I will add data about what else is happening in the world around those Nike stores. I can tell the company that around this store, where you clocked sales of a million Indian rupees, are potentially 500 customers who have not bought from you so far,” Bawri says.

The glue

Unlike most founders of deep tech startups, Ritesh Bawri is not a technologist. He holds an MBA from the University of Michigan and was the managing director of Calcom Cement until 2012, when the Bawri family handed over a controlling stake in the 2.1-million-ton manufacturing unit to Dalmia Cement.

Ritesh, Malvika, and Vinay Bawri began working with technologists in Silicon Valley in 1999, particularly in data science and search. They have filed for six patents for tech inventions in the US.

“We have been able to attract extremely smart people to join us. I am just the glue, the anchor, that brings everyone together to solve a difficult problem. I don’t have a pure tech background, but I think what we (Bawris) bring to the table is what really works in a business environment,” Ritesh Bawri says.

The Indian government recently set up an official committee under its broadcasting unit Prasar Bharati to launch a global digital news platform. Bawri is on the committee.

“I have an understanding of what value-add is possible in a business environment,” he says.

The thesis for Quantta, Bawri says, was fairly simple: Data is going to be the big driver of efficiency in understanding and predicting human behavior. They wanted to build the infrastructure to do that.

“We saw ourselves as people who could aggregate data from diverse sources, combine them in a homogeneous way, stack them up geographically – so that I can see where every ATM, bank, school, hospital is located, and then make the dark data around it useful,” Ritesh explains.

This is where Sagar Mysorekar, an expert in geographic information systems (GIS) – computer systems designed to store, retrieve, manage, display, and analyze all types of geographic data – comes in. He worked with GIS in the US for about 12 years before joining Quantta and moving to Bangalore.

A large number of business decisions are geographically driven. Take, for example, a bank’s decision on where to open a new branch. “Is there accessibility? Are there enough people who will consume the bank’s services? Is there good enough infrastructure? These are all questions that are geographical data driven,” Mysorekar tells me.

Not many analytics companies bring dark data to the table themselves. Quantta does. It has 300 different data pipes flowing into its system, according to Bawri. “Every day, we add additional data pipes. It is a complex task because data doesn’t come in a homogeneous unit. So we had to build tools that would take in very vast amounts of data and aggregate it.”

The next step is to make this data useful. For that, you have to understand and interpret it. That is, you quantify it. “Take tuberculosis (TB) in the city of Kolkata, for example. Say, there are 100 incidents of TB in Baliganj, 30 incidents in Salt Lake, and so on. I can then map it against water bodies, restaurants, malls, and so on in the city. Then I can start deriving meaning by looking at all the different data sets stacked together and see the cause for TB. That quantification happens through algorithms that we built,” Bawri says.
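One way to picture the “stacking” Bawri describes is as a join of separate data layers on a shared geographic key. The sketch below is purely illustrative and is not Quantta’s algorithm: the function name stack_layers, the layer names, and the restaurant counts are hypothetical, and only the two TB figures come from Bawri’s example.

```python
from collections import defaultdict


def stack_layers(layers):
    """Merge per-layer data into one record per locality.

    `layers` maps a layer name to a dict of {locality: value}.
    The result maps each locality to a dict of {layer name: value},
    so all the layers for one place can be read side by side.
    """
    stacked = defaultdict(dict)
    for layer_name, values in layers.items():
        for locality, value in values.items():
            stacked[locality][layer_name] = value
    return dict(stacked)


layers = {
    # TB counts taken from the example in the article.
    "tb_incidents": {"Baliganj": 100, "Salt Lake": 30},
    # Hypothetical second layer; these numbers are made up for illustration.
    "restaurants": {"Baliganj": 240, "Salt Lake": 90},
}

by_locality = stack_layers(layers)
for locality, record in by_locality.items():
    print(locality, record)
```

Once every layer sits in one record per locality, comparing TB counts with any other layer becomes a matter of reading across a row. Whether a pattern in those rows points to a cause is a separate question that the join alone cannot answer.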

I am not Amazon

Not everything has been smooth sailing for Bawri and team, especially in the beginning.

“When I go to a customer, the first thing they say is that you have no history. You have not demonstrated that you add value,” Bawri recalls.

So he came up with a hack to tackle that googly. “We would say: go back three years and tell us about a few decisions you made in that period. Then give us the data you have of the first two years. Based on it we will make predictions about the third year. You have data about what actually happened. So you can see if we got our predictions broadly right.”
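The check Bawri describes resembles what forecasters call a holdout backtest: fit on older data, predict a period whose outcome is already known, and compare. A minimal sketch follows, assuming one number per year and a naive forecast that simply repeats the previous year’s growth rate. The function names, the 10 percent tolerance, and the sample figures are all made up for illustration; the article does not say what models Quantta actually uses.

```python
from typing import Dict, Sequence


def forecast_next(year1: float, year2: float) -> float:
    """Naive forecast: assume the year-over-year growth rate repeats."""
    if year1 == 0:
        raise ValueError("year1 must be non-zero to compute a growth rate")
    return year2 * (year2 / year1)


def backtest(yearly: Sequence[float], tolerance: float = 0.10) -> Dict[str, object]:
    """Predict year 3 from years 1 and 2, then compare with what happened.

    `tolerance` is a hypothetical threshold for "broadly right",
    expressed as a fraction of the actual year-3 value.
    """
    if len(yearly) != 3:
        raise ValueError("expected exactly three yearly values")
    year1, year2, actual = yearly
    if actual == 0:
        raise ValueError("actual year-3 value must be non-zero")
    predicted = forecast_next(year1, year2)
    relative_error = abs(predicted - actual) / abs(actual)
    return {
        "predicted": predicted,
        "actual": actual,
        "relative_error": relative_error,
        "broadly_right": relative_error <= tolerance,
    }


# Made-up yearly figures for a single customer decision.
result = backtest([100.0, 120.0, 150.0])
print(result)
```

The appeal of the approach is that the customer already holds the answer key: the third year’s numbers are never shown to the forecaster, so the comparison cannot be gamed.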

It took them a year-and-a-half to bag their first customer – microfinance company Ujiwan. This despite Bawri having worked with Ujiwan earlier.

“Every day, we were knocking on the doors of potential customers. And each time we took their feedback, good and bad. Even if they said, ‘you are stupid’ or ‘you don’t really know what you are saying’. No problem, we learn from it,” Bawri recalls.

“Customers would say, hey, it is all nice to hear but I don’t really want this. Or, give it to me for free. Or, you should pay me because I am giving you my data. In the beginning, we got more rejections than acceptances.”

The challenges have not ended there.

“There may be about 60 billion data points already flowing into the Quantta system. And we are trying to integrate more data streams to it. The technological challenges of doing that in a cost efficient manner is not small. On top of it, you are trying to predict human behavior,” Bawri says.

There is also “a huge friction in trying to scale a business like this,” he says. “Talent is always a problem. To get smart people to come join a young startup is difficult. I am not Amazon.”

Quantta flew under the radar for a while, notching up the intelligence of its tech stack. It has had about 100 paying clients and was cash-flow positive without any external capital. (Cash flow is different from net income, which may have to account for expenses on debts, depreciation, and so on in the annual statement.)

This year, Bawri felt it was finally time to press on the pedal and that’s why Quantta raised a round of seed capital a few months ago.

The decision to go for funding was driven by two reasons. One is, obviously, to grow the company. Quantta wants to go nation-wide and have data from every nook and corner of India. That will need a deeper pocket. “Secondly, what we are trying to do is much bigger than what the existing team is capable of doing. That’s where investors with deep networks and very critical minds come in. They have the ability to sort of question and challenge the assumptions that we make at Quantta,” Bawri says.

Investor Anmol Nayyar joined Quantta after the funding round. According to Bawri, Anmol and other investors who came on board brought with them an ability to think much bigger than what the founders had in mind. “I don’t mean by 20 percent bigger, but 3000 percent bigger than what we were thinking.”

Quantta is now setting up Quantta Labs in Silicon Valley as well.
