Category Archives: Data Science

The video for my talk at Balanced Payments a few weeks ago is now available on YouTube.

In order to introduce my recently published ebook’s content in a new way, I focused on the balancing act needed when managing risk, and cases in which this balance isn’t found.

Catch it here:

Smart, non-techie people sought for new project

You read it right! I’m looking into a new and exciting project in financial services, and it requires one or two people who are not engineers but can handle technology, and who are operationally minded but don’t necessarily have years of experience. Here’s something I wrote a few years ago that captures the kind of people I enjoy working with (this is not for PayPal; see after the quote):

What I’m looking for is results-driven, quick-thinking do-it-alls who want to be involved with new products, markets and risk challenges within PayPal. You should have the passion for consuming a lot of data and information, be able to learn quickly and identify and define trends in concise terms. You should be analytical and with a quantitative approach but not a data cruncher without any understanding of the big picture – we are playing on all fronts. Know or be able to learn how to drive processes through other people and organizations; working in ambiguous situations and coping with change is a must, as well as an ever-changing operating rhythm. This is not your classic 9 to 5 and I’m not your classic 9 to 5 manager.

Experience is not a must (=graduates are also encouraged to apply), definitely not previous experience in risk management. However, please be an avid internet user, preferably a gamer in your past or present. Some security experience or tech savvy is a big plus – don’t get intimidated by developers, architects and tech talk. Impress me by having interesting hobbies out of work that you maintain although you are an aggressive achiever, and by having vast general knowledge (as in: you shout answers at “who wants to be a millionaire” while watching it on TV).

This is an excellent opportunity to be part of a founding team of a new startup that I think is very interesting, and to get a glimpse into the method and ideas that made FraudSciences, Analyzd, Signifyd (and hopefully this one as well) such a lucrative deal for investors, customers and acquiring corporations. This is also an opportunity for extremely smart people who aren’t engineers and are looking for a way into startups and don’t know how. Refer your best friends 😉

Please help me spread the word! Contact me directly for details.

NOTE: local SF Bay area folks highly preferred.

Forget Big Data

These are the slides from a talk I gave last week. The gist of it: “Big Data” in Fraud and Risk prevention for payments won’t suffice, and must be augmented by domain experts (including a few notes about reasons for that, a bit about domain experts, and some real life examples). Nothing new for readers of this blog, but you may find the slides or wording helpful.

Even good things come to an end – Why I’m leaving Klarna

I wasn’t sure I was going to write this post, but I’ve had several of these conversations in the past few weeks and putting my thoughts here is a good kind of closure.

The Analyzd team joined Klarna in 2011 to bring our way of doing Risk and Product to a company that was starting its international push. Klarna simplified payments in a way I found appealing but that was easy to defraud and abuse, and what we brought to the table made a difference. It was also my first attempt at testing my philosophy about Risk completely on my own – not as a group leader in a small startup or a huge behemoth like PayPal but in a company in hyper-growth. Boy, was I in for a ride!

It’s been two years. I learned a ton – first, that this thing really worked. The amazing team and I managed to bring Klarna to new levels of fraud and risk prevention as well as technical prowess, introducing technical and predictive innovation that allowed for some of the lowest rejection and default rates in the industry on an annual payment volume of close to $2.5 billion in online, real-time short-term credit. Klarna Risk is not only a great place to work and a well-oiled machine delivering results but also a great place to be from, and I expect many of the young leaders to move on to lead teams in the coming five years, much like the FraudSciences team did. I also learned what it means to be part of a hyper-growth company, to raise money and work with some of the most impressive VCs in Financial Services and in general, to work with regulators and traditional finance companies, and many other things.

I also learned something else – I like building and inventing and am much less the exec type, and as the team matured it needed me less. After much deliberation I decided that it’s time for me to go and build something new, and let the professional executives run the show. I’m leaving behind a big, mature and strong team led by some of the most talented people I had the fortune to work with and a company with a bright future, led by smart and capable leaders. I will continue to be bullish about Klarna’s ongoing success and help them out as much as I can, while I go back to square one and build something new, hopefully again something helpful.

I am slowly transitioning out of my role, and will stick around through the end of 2012, but come 2013 I am planning to be out and about. More info on my next steps to come in the near future. Stay tuned.

Feature companies in risk and fraud (or: enterprise and consumer startups are different)

David Pakman quoted Tim Armstrong today: “I’ve seen too many feature companies get hot, raise too much $ and get way too overvalued.”

It is interesting to see that this is true in various markets. The trend is extremely obvious in payments/security as well, and is a by-product of the boom in seed funding for consumer startups. I wrote extensively about payments startups and how important it is to know where you are in the value chain. The thing is that I see the same in risk and fraud detection, where you’d expect the need for complete and complex products to be obvious.

Indeed, the concepts of consumer and lean startups trickled into enterprise; as a result, small 2-3 person teams are trying to build rudimentary detection mechanisms, mostly based on “social data” (a euphemism for opt-in Connect or scraping Facebook directly), and expect to position themselves in the market as serious providers in a short time frame. This is far from a reasonable expectation. However, since money is abundant and is only looking for a way out of pure consumer plays, some of these teams get funded, end up overvalued, and are then unable to cut their losses with an acqui-hire, otherwise the most likely outcome.

While I agree consumerization of the enterprise is real, this is not a sustainable approach. The definition of an MVP (much more feature complete) and iteration (much longer) as well as what it means to do customer development is very different in enterprise. Small merchants continue to think more and more like consumers and are becoming more tech savvy, and that leads to more usage of SaaS tools and more openness to outsourcing some non-core activities (in eCommerce, fraud prevention may well be considered non-core). That doesn’t mean they are open to testing any new tool that gets put out there; the time as well as expertise to integrate and evaluate its performance may be more than they can afford. You can’t trust your tool to just get picked up at random to a reasonable scale and learn from there, unless you have a very big war chest; then we go back to the funding issue.

A case in point is device fingerprinting (DFP) companies. A few years back DFP (often a glorified piece of JavaScript) was all the rage. Since it wasn’t a text or Flash based cookie, most fraudsters, themselves not more than script kiddies, did not have the knowledge or tools to properly resist being profiled. As a result, for a while it worked well, especially in reducing short-term, horizontally scaled attacks. Only there were a few problems: overfunded companies built too big a team, especially heavy on the Sales side, since Sales cycles with financial institutions were long and required a lot of patience, as well as multiple integration solutions. Since the teams were big and sales took time, each contract had to be big, so pricing went up as much as possible rather than adopting a freemium model that could boost adoption. Moreover, once fraudsters and engineers caught on, it was easy to circumvent or duplicate, either internally at retailers and banks or by competitors. As a result, most of these companies are struggling and dealing mostly with litigation against competitors over some negligible IP.

In enterprise, specifically in security, one feature isn’t enough, starting lean is more complicated, and just a feature will not do no matter how many patents you have pending. One option is to take your time to come up with a holistic solution, and that is tremendously harder to build (in fact, since FraudSciences was acquired, only Signifyd and Sift Science have tried building a standalone risk-as-a-service solution). The other is to start very slow and very lean, and raise very little capital. MaxMind is a good example of the latter. It’s a whole different world out there now, especially for enterprise startups. Make sure you build a real product that can sell. Don’t build a feature.

What’s Missing in Data Science Talks

On January 28th, 2008, the $169M sale of Israeli FraudSciences to eBay’s payments division PayPal was publicly announced. I was part of the 65 person crew and head of the analytics group at the time. FraudSciences became PayPal’s Israeli R&D center and is still a thriving team spanning more than 100 people and providing great value to the company. Our story has even been mentioned on StartUp Nation, in an inspired-by-a-true-story style dramatization of events.

The sale and its ramifications are not what I want to talk about, though; what I do want to talk about are the events that led to that sale, and more specifically the test that PayPal ran us through. You see, PayPal had to see whether our preposterous claims about how good our algorithms were held true, so they threw a good chunk of transactions at us to analyze and send back to them with our suggested decisions. Long story short, our results had an upside of up to 17% over PayPal’s own algorithms at the time, and the rest is history.

How did we do that, then? We must have had a ton of data. We must have used algorithm X or technique Y. We must have been masters of Hadoop. Wait – no. 2007. Nothing of the sort. Everything takes forever. To get to these results we didn’t even use the two famous patents FraudSciences viewed as huge assets since they required some sort of real time interaction with the buyer. What we did have were roughly 40,000 (indeed) well-tagged purchases, good segmentation, and great engineered features all geared at very well defined user behaviors. What we had, plain and simple, was strong domain expertise.

Domain expertise, or lack thereof, is exactly my issue with the talk about Data Science today. Here’s an example: I recently had a friend, a strong domain expert, rejected from a pretty nascent startup filled with very smart engineers since they didn’t really know where to place his non-developer profile in their team. Were they wrong to not hire him? Maybe, maybe not. I can’t judge. Were they wrong to make the decision based on coding skills? Most definitely. It’s a very common passion for data and ML geeks such as ourselves to embark on the (in my opinion) hubris-driven task of building an artificial intelligence that will solve all problems, the Generic SkyNet. We neglect to admit the need for specific knowledge. That is when discussions of the volume and structure of data sets replace a keen understanding of what people are trying to achieve – when complex tools replace user research. Unsurprisingly, these attempts either fail or scale down to take on one domain at a time. They can still take over the world – just with a different strategy.

When I read people on Kaggle, in itself an amazing website and community, list the tools they threw at a dataset instead of how they led with a pure analysis of patterns and indicators, I cringe a little. This is a craft fueled by excess – in space, in memory, in computing power, even in data. While that excess is oftentimes highly useful, almost as often it makes us miss the heuristic just in front of our eyes. I think that analysis and Data Science need to incorporate this realization as well, to become a real expertise.

Fraud detection and prevention and Credit issuance, the stuff we deal with on a daily basis at Klarna, are areas where this is an obvious issue. High fragmentation in geographies, payment instruments and products creates smaller training and validation sets than you’d ideally want. The need to wait for a default or a chargeback lengthens the time between iterations. Bad signals are scarce compared to other types of classification. Operational issues and fraudsters’ strong incentives to hide (as well as abuse or “friendly” fraud) cause “dirty” performance flags. And still we have a shop that makes some accurate decisions using a number of instances per segment that Data Science teams would frown upon. How is that? The same way FraudSciences gave PayPal’s algorithms a run for their money – we use domain expertise to distill features that capture interaction in a way that automated feature engineering methods will find hard to imitate. We use bottom-up analysis of behavioral patterns. We add a sprinkle of behavioral economics (but building a purchase flow is a completely different story).
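
To make the idea of expert-distilled features a little more concrete, here is a minimal sketch in Python. It is an illustration only, not how Klarna or FraudSciences actually did it: the Purchase fields, the seven-day "new account" threshold and the function names are all hypothetical, picked to show how someone who knows the domain might encode a few well-defined behaviors as explicit signals, computed per segment, instead of leaving everything to automated feature generation.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Purchase:
    # Hypothetical fields, chosen for illustration only.
    amount: float
    billing_country: str
    shipping_country: str
    account_age_days: int
    segment: str  # e.g. a market/product combination


def compute_segment_medians(purchases):
    """Median order amount per segment, from historical purchases."""
    amounts_by_segment = {}
    for p in purchases:
        amounts_by_segment.setdefault(p.segment, []).append(p.amount)
    return {seg: median(vals) for seg, vals in amounts_by_segment.items()}


def engineer_features(purchase, medians):
    """Turn one purchase into a few hand-picked behavioral signals."""
    seg_median = medians.get(purchase.segment)
    return {
        # Goods shipped to a different country than the billing address.
        "country_mismatch": purchase.billing_country != purchase.shipping_country,
        # Assumed threshold: accounts younger than a week count as new.
        "new_account": purchase.account_age_days < 7,
        # Order size relative to what is typical for this segment;
        # None when the segment has no history yet.
        "amount_vs_segment_median": (
            purchase.amount / seg_median if seg_median else None
        ),
    }


# Example usage with made-up data.
history = [
    Purchase(50.0, "SE", "SE", 400, "fashion"),
    Purchase(70.0, "SE", "SE", 30, "fashion"),
    Purchase(90.0, "DE", "DE", 10, "fashion"),
]
medians = compute_segment_medians(history)
print(engineer_features(Purchase(350.0, "SE", "NL", 2, "fashion"), medians))
```

Each signal here is something an analyst could explain to an operations agent in one sentence, which is the point: with small, fragmented segments, a few features like these may carry more weight than a large generated feature set.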

This aspect of what we do is available to any Data Scientist out there – I’ve written extensively about finding domain experts. They’re around you. Use them – and don’t get hooked on the big guns just because they’re there*.

*Well, only if you want to get better results quicker and are acting under market and product constraints. If you’re a contributor to an open source project – carry on with your great work!

As Risky As It Gets

Thoughts about Fraud Prevention, Payments, Machine Learning and more – by Ohad Samet
