There are new available articles, click to refresh the page.

Before yesterdayYour RSS feeds

Weekly Update 460

By: Troy Hunt

This week's update is the last remote one for a while as we wind up more than a month of travel. I'm pushing this out just before we jump on the Qantas plane home... right after they've advised just how much of my data was impacted by their breach. That got me thinking in this week's video: what type of "third-party service" would expose those classes of data? My bet is on a party dealing with frequent flyers, perhaps a call centre or other processor responsible for managing their reward program. Hopefully, investigations will lead to transparency, and we'll find out, but I wouldn't be holding my breath on that timeline. For now, here are my thoughts:

References

Sponsored by: Malwarebytes Browser Guard blocks phishing, ads, scams, and trackers for safer, faster browsing
The UK's NCA has picked up 4 individuals they've charged with the recent attacks on big retail (it's mostly the usual story of young guys, with one exception)
Looks like a heap of data points were exposed for my personal Qantas profile (compared to other family members, that is)
We've welcomed Push Security to Have I Been Pwned's partner program (they're now on the business-facing pages of the dashboard)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
July 13^th 2025 at 02:32

Troy Hunt
Welcoming Push Security to Have I Been Pwned's Partner Program
July 9^th 2025 at 22:50

Welcoming Push Security to Have I Been Pwned's Partner Program

By: Troy Hunt

As we gradually roll out HIBP’s Partner Program, we’re aiming to deliver targeted solutions that bridge the gap between being at risk and being protected. HIBP is the perfect place to bring these solutions to the forefront, as it's often the point at which individuals and organisations first learn of their exposure in data breaches. The challenge for corporates, in particular, is especially significant as they're tasked with protecting entire workforces, often against highly motivated and sophisticated attackers seeking to exploit organisational vulnerabilities. That's why today, I'm especially happy to welcome Push Security to the program.

Push's mandate is to "defend workforce identities in the browser" from attacks that put corporate assets at risk. Especially within the context of data breaches, this includes attacks that leverage reused credentials (which often appear in breaches), account takeovers, phishing and session hijacking. Protecting organisations directly in the browser makes a lot of sense given how many attacks originate in that environment (something I'm painfully familiar with myself), and as they're fond of saying, "Push Security is like EDR but for the browser".

Because Push is focused on business solutions, they now have placement within the business section of the HIBP dashboard, namely the overview and domains pages:

I'm really happy with how we've been able to position partners in a way that's contextual, relevant and non-obtrusive. We've clearly marked Push as "Sponsored" and positioned them right at the heart of where those protecting organisatoins spend their time on HIBP.

Lastly, we've also now launched a dedicated partners page, which lists each relationship we have, including Push Security:

Regardless of where you are in the world, you'll see each partner, the pages on which they are displayed, and any geolocation dependencies. This ensures both transparency and exposure for the organisations we've entrusted to help protect users of our service.

So, a big welcome to Push Security and one more piece in the puzzle of protecting organisations from the scourge of data breaches.

🏷️ My labels
- ❌
Article tags
- ❌
- Have I Been Pwned
July 9^th 2025 at 22:50

Troy Hunt
Weekly Update 459
July 8^th 2025 at 07:59

Weekly Update 459

By: Troy Hunt

New week, different end of the world! After a fleeting stop at home, we're in Japan for a proper holiday (yet somehow I'm still here writing this...) with the first stop in Tokyo. It's like nowhere else here, and this is now probably my 10th trip to Japan over a period of more than three decades. What I think has changed the most in terms of my perceptions of Japan is that back in the 90s, it was just so high tech here because we hadn't seen a lot of the stuff that was on the main streets of Tokyo. Now, the world is much more global; we're all using the same phones and watching the same TVs and nobody is talking about the Walkman any more. Same epic food though, and we've been smashing through some amazing dishes (full pics on Facebook). The next update will come from Kyoto before we head back to the sunny Aussie winter.

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
Yet another spyware maker has had their customer data land in HIBP (I think Catwatchful is now the 9th one there)
Also on Catwatchful, it looks like good old SQL injection still wreaking havoc (crazy this is still a thing)
Aussie identity protection service Truyu is the first new partner to be onboarded (that is, since 1Password in 2018)
Looks like I'm in yet another data breach, this time courtesy of our national airline (let's see if data appears anywhere...)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
July 8^th 2025 at 07:59

Troy Hunt
Welcoming Truyu to Have I Been Pwned's Partner Program
July 2^nd 2025 at 23:28

Welcoming Truyu to Have I Been Pwned's Partner Program

By: Troy Hunt

I always used to joke that when people used Have I Been Pwned (HIBP), we effectively said "Oh no - you've been pwned! Uh, good luck!" and left it at that. That was fine when it was a pet project used by people who live in a similar world to me, but it didn't do a lot for the everyday folks just learning about the scary world of data breaches. Partnering with 1Password in 2018 helped, but the impact of data breaches goes well beyond the exposure of passwords, so a couple of months ago, I wrote about finding new partners to help victims "after the breach", Today, I'm very happy to welcome the first such partner, Truyu.

I alluded to Truyu being an excellent example of a potential partner in the aforementioned blog post, so their inclusion in this program should come as no surprise, but let me embellish further. In fact, let's start with something very topical as of the moment of posting:

New email from @Qantas just now: “we believe your personal information was accessed during the cyber incident”. They definitely deserve credit for early communication. pic.twitter.com/dTLlvI0Byq
— Troy Hunt (@troyhunt) July 2, 2025

It's pure coincidence that Qantas' incident coincides with the onboarding of an Aussie identity protection service, but it also makes it all the more relevant. My own personal circumstances are a perfect example: apparently, my name, email address, phone number, date of birth, and frequent flyer number are now in the hands of a hacking group not exactly known for protecting people's privacy. In the earlier blog post about onboarding new partners, I showed how Truyu had sent me early alerts when my identity data was used to sign up for a couple of different financial services. If that happens as a result of the Qantas breach, at least I'm going to know about it early.

The introduction of Truyu as the first of several upcoming partners heralds the first time we've tailored content based on the geolocation of the user. What that means is that depending on where you are in the world, you may see something different to this:

I'm seeing Truyu on the Dropbox breach page because I'm in Australia, and if you're not, you won't. You'll have your own footer with your own country, which is based on Cloudflare's IP geolocation headers. In time, depending on where you are in the world, you'll see more content tailored specifically for you where it's relevant to your location. That's not just product placements either, we'll be adding other resources I'll share more about shortly.

Putting another brand name on HIBP is not something I take lightly, as is evidenced by the fact this is only the second time I've done this in nearly 12 years. Truyu is there because it's a product I genuinely believe provides value to data breach victims and in this case, one I also use myself. And for what it's worth, I've also spent time with the Truyu team in person on multiple occasions and have only positive things to say about them. That, in my book, goes a long way.

So, that's our new partner, and they've arrived at just the perfect time. Now I'm off to jump on a Qantas flight, wish me luck!

🏷️ My labels
- ❌
Article tags
- ❌
- Have I Been Pwned
July 2^nd 2025 at 23:28

Troy Hunt
Weekly Update 458
July 2^nd 2025 at 08:08

Weekly Update 458

By: Troy Hunt

I'm in Austria! Well, I was in Austria, I'm now somewhere over the Aussie desert as I try and end this trip on top of my "to-do" list. The Have I Been Pwned Alpine Grand Tour was a great success with loads of time spent with govs, public meetups and users of this little data breach project that kinda escalated. As I say in the vid, I'm posting a lot more pics publicly to my Facebook page, so if you want to see the highlights, head over there. That's it for this week, it's home for a day then I'll come to you from Tokyo for the next one.

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
Have Fun Teaching was breached 4 years ago and 27k of their records are now in HIBP (they went very much "radio silence" after disclosure)
Robinsons Malls in the Philippines had a breach thay finally made its way into HIBP (the breach itself was back in June last year)
Because Teespring was frankly, appallingly bad, we have a new merch store courtesy of Fourthwall (if you ordered from Teespring and haven't received your merch, contact their support and if that doesn't work, dispute the charge with your card company)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
July 2^nd 2025 at 08:08

Troy Hunt
Weekly Update 457
June 21^st 2025 at 17:36

Weekly Update 457

By: Troy Hunt

Firstly, apologies for the annoying clipping in the audio. I use a Rode VideoMic that's a shotgun style that plugs straight into the iPhone and it's usually pretty solid. It was also solid when I tested it again now, just recording a video into the phone, so I don't know if this was connection related or what, but I was in no position to troubleshoot once the stream had started, unfortunately.

Moving on, it's been a ridiculously hectic week of bacb-to-back events then to top it off, we've bee dealing with crazy traffic volumes on HIBP:

Well, that explains the traffic: 2.46M visitors to Have I Been Pwned in 24 hours, mostly from Google searches. The inbound traffic is near unprecedented, with only the Collection 1 credential stuffing list in Jan 2019 and the Facebook scrape in April 2021 coming close. pic.twitter.com/li7qvfy9tk
— Troy Hunt (@troyhunt) June 21, 2025

Anyway, you just can't predict these things, hope you enjoy this week's video regardless.

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
If you want to follow along with travels, most of the pics I post these days are going to a public Facebook account (such is the fragmented social media world today)
Catch me in Rome next week for the DotNetCode Italy meetup (that'll be the last public event of the tour)
Was it really 16B passwords? (obviously this story got huge traction, let's see what the data says)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
June 21^st 2025 at 17:36

Troy Hunt
Weekly Update 456
June 12^th 2025 at 09:51

Weekly Update 456

By: Troy Hunt

It's time to fly! It's two months to the day since we came back from the last European trip, again spending the time with some of the agencies and partners we've fostered at HIBP over the years. This time, it's the driving tour I talked about earlier last month, and we have absolutely jam-packed it! But hey, it's a part of the world I love driving in, it's summer over there (I know, it's a bit upside-down in that half of the world), and there are lots of cool people and places to see. Interesting, Switzerland was by far the most dominant "come and say g'day" country, and we've ended up with events or meetups in Zurich, Bern and Geneva, along with invites in other places we just didn't have time to make work. But Switzerland is awesome, so perhaps that's a place for a longer stay next time with a little less grand touring. Regardless, I'll come to you with another live stream next Friday from Monaco 😎

References

Sponsored by: Malwarebytes Browser Guard blocks phishing, ads, scams, and trackers for safer, faster browsing
Catch me in Zurich on Monday (that one is courtesy of the Azure Zurich User Group)
And in Rome the week after (thank you DotNetCode Italy for hosting!)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
June 12^th 2025 at 09:51

Troy Hunt
Weekly Update 455
June 9^th 2025 at 08:27

Weekly Update 455

By: Troy Hunt

The bot-fighting is a non-stop battle. In this week's video, I discuss how we're tweaking Cloudflare Turnstile and combining more attributes around how bot-like requests are, and... it almost worked. Just as I was preparing to write this intro, I found a small spike of anomalous traffic that, upon further investigation, should have been blocked. So we've pivoted again, adding yet more logic to try and give legit humans the best experience possible whilst making it painful for the bots. Fortunately, we're doing this with resources that have minimal impact if a limited number of bot requests come through, but it does make for a challenging if not somewhat infuriating experience.

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
We've now identified the first round of partners to onboard to HIBP (these are companies that can help victims "after the breach")
ColoCrossing had a breach that exposed 7k customer email addresses for their cloud service (looks like this just ColoCloud)
We love the HIBP merch store, but Teespring's support is absolutely woeful (we'll move to an alternate provider in the very near future)
We're still tweaking Cloudflare's Turnstile to keep the bad guys out and the good guys in (that's a link to the HIBP homepage which we think we have dialed in pretty good now, see if you get a nice async request or a full page post-back)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
June 9^th 2025 at 08:27

Troy Hunt
Weekly Update 454
June 2^nd 2025 at 10:26

Weekly Update 454

By: Troy Hunt

We're two weeks in from the launch of the new HIBP, and I'm still recovering. Like literally still recovering from the cold I had last week and the consequent backlog. A major launch like this isn't just something you fire and forget; instead, it takes weeks of tweaks and refinements to iron out all the little creases, both known and unpredictable. None of them have been significant, fortunately, but the more I look at it, the more I see, and the more we refine. This week, we're diving headfirst into something I'd rather avoid: wacky procurement demands. Stuff like quote generation so that you can have the same stuff as you can find on the pricing page right now, just as a PDF with your name on it 🤦‍♂️ And look, I get it - it's not the people reading this making those demands and I have tread in your shoes and felt your pain. Hopefully, sometime this week, we'll automate away both your and my pain, and that'll be a massive step forward for all of us. Stay tuned!

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
I'm coming to Zurich! (now at the correct date of June 16)
The Fédération Francaise de Rugby breach turned up (282k people in there, including with their DoBs for some reason 🤷‍♂️)
Sticking with the French theme, their "Free" ISP data popped up too (another 14M people there, also with dates of birth 🤷‍♂️)
And the second coming of Operation Endgame also made its way to HIBP (with support from our friends in LEA 👮)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
June 2^nd 2025 at 10:26

Troy Hunt
Weekly Update 453
May 27^th 2025 at 00:26

Weekly Update 453

By: Troy Hunt

Well, the last few weeks of insane hours finally caught up with me 🤒 Not badly, but I evidently burned enough midnight oil to leave the immune system somewhat degraded and just after recording this video, I really didn't feel like doing much at all. Some congestion and sniffles aside, it's really not that bad, but definitely evidence of a very intense period, which thankfully, is now behind us. So, this week, let's talk about that awesome new HIBP website 😊

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
We launched! (the end of one era, the beginning of another)
Cloudflare's Turnstile is protecting a bunch of features in the new HIBP site from automation (but we do need to work on the rate at which it thinks real people are bots)
I later put out a poll on the rate at which Turnstile was blocking access (when I speculated about 10%, I was pretty close - it's actually 8.7%)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
May 27^th 2025 at 00:26

Have I Been Pwned 2.0 is Now Live!

By: Troy Hunt

This has been a very long time coming, but finally, after a marathon effort, the brand new Have I Been Pwned website is now live!

Have I Been Pwned 2.0 is Now Live!

Feb last year is when I made the first commit to the public repo for the rebranded service, and we soft-launched the new brand in March of this year. Over the course of this time, we've completely rebuilt the website, changed the functionality of pretty much every web page, added a heap of new features, and today, we're even launching a merch store 😎

Let me talk you through just some of the highlights, strap yourself in!

The Search

The signature feature of HIBP is that big search box on the front page, and now, it's even better - it has confetti!

Well, not for everyone, only about half the people who use it will see a celebratory response. There's a reason why this response is intentionally jovial, let me explain:

As Charlotte and I have travelled and spent time with so many different users of the service around the world, a theme has emerged over and over again: HIBP is a bit playful. It's not a scary place emblazoned with hoodies, padlock icons, and fearmongering about "the dark web". Instead, we aim to be more consumable to the masses and provide factual, actionable information without the hyperbole. Confetti guns (yes, there are several, and they're animated) lighten the mood a bit. The alternative is that you get the red response:

There was a very brief moment where we considered a more light-hearted treatment on this page as well, but somehow a bit of sad trombone really didn't seem appropriate, so we deferred to a more demure response. But now it's on a timeline you can scroll through in reverse chronological order, with each breach summarising what happened. And if you want more info, we have an all-new page I'll talk about in a moment.

Just one little thing first - we've dropped username and phone number search support from the website. Username searches were introduced in 2014 for the Snapchat incident, and phone number searches in 2021 for the Facebook incident. And that was it. That's the only time we ever loaded those classes of data, and there are several good reasons why. Firstly, they're both painful to parse out of a breach compared to email addresses, which we simply use a regex to extract (we've open sourced the code that does this). Usernames are a string. Phone numbers are, well, it depends. They're not just numbers because if you properly internationalise them (like they were in the Facebook incident), they've also got a plus at the front, but they're frequently all over the place in terms of format. And we can't send notifications because nobody "owns" a username, and phone numbers are very expensive to send SMSs to compared to sending emails. Plus, every other incident in HIBP other than those two has had email addresses, so if we're asking "have I been pwned?" we can always answer that question without loading those two hard-to-parse fields, which usually aren't present in most breaches anyway. When the old site offered to accept them in the search box, it created confusion and support overhead: "why wasn't my number in the [whatever] breach?!". That's why it's gone from the website, but we've kept it supported on the API to ensure we don't break anything... just don't expect to see more data there.

The Breach Page

There are many reasons we created this new page, not least of which is that the search results on the front page were getting too busy, and we wanted to palm off the details elsewhere. So, now we have a dedicated page for each breach, for example:

That's largely information we had already (albeit displayed in a much more user-friendly fashion), but what's unique about the new page is much more targeted advice about what to do after the breach:

I recently wrote about this section and how we plan to identify other partners who are able to provide appropriate services to people who find themselves in a breach. Identity protection providers, for example, make a lot of sense for many data breaches.

Now that we're live, we'll also work on fleshing this page out with more breach and user-specific data. For example, if the service supports 2FA, then we'll call that out specifically rather than rely on the generic advice above. Same with passkeys, and we'll add a section for that. A recent discussion with the NCSC while we were in the UK was around adding localised data breach guidance, for example, showing folks from the UK the NCSC logo and a link to their resource on the topic (which recommends checking HIBP 🙂).

I'm sure there's much more we can do here, so if you've got any great ideas, drop me a comment below.

The Dashboard

Over the course of many years, we introduced more and more features that required us to know who you were (or at least that you had access to the email address you were using). It began with introducing the concept of a sensitive breach during the Ashley Madison saga of 2015, which meant the only way to see your involvement in that incident was to receive an email to the address before searching. (Sidenote: There are many good reasons why we don't do that on every breach.) In 2019, when I put an auth layer around the API to tackle abuse (which it did beautifully!) I required email verification first before purchasing a key. And more things followed: a dedicated domain search dashboard, managing your paid subscription and earlier this year, viewing stealer logs for your email address.

We've now unified all these different places into one central dashboard:

From a glance at the nav on the left, you can see a lot of familiar features that are pretty self-explanatory. These combine relevant things for the masses and those that are more business-oriented. They're now all behind the one "Sign In" that verifies access to the email address before being shown. In the future, we'll also add passkey support to avoid needing to send an email first.

The dashboard approach isn't just about moving existing features under one banner; it will also give us a platform on which to build new features in the future that require email address verification first. For example, we've often been asked to provide people with the ability to subscribe their family's email addresses to notifications, yet have them go to a different address. Many of us play tech support for others, and this would be a genuinely useful feature that makes sense to place at a point where you've already verified your email address. So, stay tuned for that one, among many others.

The Domain Search Feature

More time went into this one feature than most of the other ones combined. There's a lot we've tried to do here, starting with a much cleaner list of verified domains:

The search results now give a much cleaner summary and add filtering by both email address and a hotly requested new feature - just the latest breach (it's in the drop-down):

All those searches now just return JSON from APIs and the whole dashboard acts as a single-page app, so everything is really snappy. The filtering above is done purely client-side against the full JSON of the domain search, an approach we've tested with domains of over a quarter million breached email addresses and still been workable (although arguably, you really want that data via the API rather than scrolling through it in a browser window).

Verification of domain ownership has also been completely rewritten and has a much cleaner, simpler interface:

We still have work to do to make the non-email verification methods smoother, but that was the case before, too, so at least we haven't regressed. That'll happen shortly, promise!

The API

First things first: there have been no changes to the API itself. This update doesn't break anything!

There's a discussion over on the UX rebuild GitHub repo about the right way to do API documentation. The general consensus is OpenAPI and we started going down that route using Scalar. In fact, you can even see the work Stefan did on this here at haveibeenpwned.com/scalar:

It's very cool, especially the way it documents samples in all sorts of different languages and even has a test runner, which is effectively Postman in the browser. Cool, but we just couldn't finish it in time. As such, we've kept the old documentation for now and just styled it so it looks like the rest of the site (which I reckon is still pretty slick), but we do intend to roll to the Scalar implementation when we're not under the duress of such a big launch.

The Merch Store

You know what else is awesome? Merch! No, seriously, we've had so many requests over the years for HIBP branded merch and now, here we are:

We actually now have a real-life merch store at merch.haveibeenpwned.com! This was probably the worst possible use of our time, considering how much mechanical stuff we had to do to make all the new stuff work, but it was a bit of a passion project for Charlotte, so yeah, now you can actually buy HIBP merch. It's all done through Teespring (where have I heard that name before?!) and everything listed there is at cost price - we make absolutely zero dollars, it's just a fun initiative for the community 🙂

We did try out their option for stickers too, but they fell well short of what we already had up with our little one-item store on Sticker Mule so for now, that remains the go-to for laptop decorations. Or just go and grab the open source artwork and get your own printed from wherever you please.

The Nerdy Bits

We still run the origin services on Microsoft Azure using a combination of the App Service for the website, "serverless" Functions for most APIs (there are still a few async ones there that are called as a part of browser-based features), SQL Azure "Hyperscale" and storage account features like queues, blobs and tables. Pretty much all the coding there is C# with .NET 9.0 and ASP.NET MVC on .NET Core for the web app. Cloudflare still plays a massive role with a lot of code in workers, data in R2 storage and all their good bits around WAF and caching. We're also now exclusively using their Turnstile service for anti-automation and have ditched Google's reCAPTCHA completely - big yay!

The front end is now latest gen Bootstrap and we're using SASS for all our CSS and TypeScript for all our JavaScript. Our (other) man in Iceland Ingiber has just done an absolutely outstanding job with the interfaces and exceeded all our expectations by a massive margin. What we have now goes far beyond what we expected when we started this process, and a big part of that has been Ingiber's ability to take a simple requirement and turn it into a thing of beauty 😍 I'm very glad that Charlotte, Stefan and I got to spend time with him in Reykjavik last month and share some beers.

We also made some measurable improvements to website performance. For example, I ran a Pingdom website speed test just before taking the old one offline:

And then ran it over the new one:

So we cut out 28% of the page size and 31% of the requests. The load time is much of a muchness (and it's highly variable at that), but having solid measures for all the values in the column on the right is a very pleasing result. Consider also the commentary anyone in web dev would have seen over the years about how much bigger web pages have become, and here we are shaving off solid double-digit percentages 11 years later!

Finally, anything that could remotely be construed as tracking or ad bloat just isn't there, because we simply don't do any of that 🙂 In fact, the only real traffic stats we have are based on what Cloudflare sees when the traffic flows through their edge nodes. And that 1Password product placement is, as it's always been, just text and an image. We don't even track outbound clicks, that's up to them if they want to capture that on the landing page we link to. This actually makes discussions such as we're having with identity theft companies that want product placement much harder as they're used to getting the sorts of numbers that invasive tracking produces, but we wouldn't have it any other way.

The AI

I wanted to make a quick note of this here, as AI seems to be either constantly overblown or denigrated. Either it's going to solve the world's problems, or it just produces "slop". I used Chat GPT in particular really extensively during this rebuild, especially in the final days when time got tight and my brain got fried. Here are some examples where it made a big difference:

I'm using Bootstrap icons from here: https://icons.getbootstrap.com/

What's a good icon to illustrate a heading called "Index"?

This was right at the 11th hour when we realised we didn't have time to implement Scalar properly, and I needed to quickly migrate all the existing API docs to the new template. There are over 2,000 icons on that page, and this approach meant it took about 30 seconds to find the right one, each and every time.

We killed off some pages on the old site, but before rolling it over, I wanted to know exactly what was there:

Write me a PowerShell script to crawl haveibeenpwned.com and write out each unique URL it finds

And then:

Now write a script to take all the paths it found and see if they exist on stage.haveibeenpwned.com

It found good stuff too, like the security.txt file I'd forgotten to migrate. It also found stuff that never existed, so it's the usual "trust, but verify" situation.

And just a gazillion little things where every time I needed anything from some CSS advice to configuring Cloudflare rules to idiosyncrasies in the .NET Core web app, the correct answer was seconds away. I'd say it was right 90% of the time, too, and if you're not using AI aggressively in your software development work now (and I'm sure there are much better ways, too) I'm pretty confident in saying "you're doing it wrong".

The Journey Here

It's hard to explain how much has gone into this, and that goes well beyond just what you see in front of you on the website today. It's seemingly little things, like minor revisions to the terms of use and privacy policy, which required many hours of time and thousands of dollars with lawyers (just minor updates to how we process data and a reflection of new services such as the stealer logs).

We pushed out the new site in the wee hours of Sunday morning my time, and almost everything went well:

One or two little glitches that we've fixed and pushed quickly, that's it. I've actually waited until now, 2 days after going live, to publish this post just so we could iron out as much stuff as possible first. We've pushed more than a dozen new releases already since that time, just to keep iterating and refining quickly. TBH, it's been a bit intense and has been an enormously time-consuming effort that's dominated our focus, especially over the last few weeks leading up to launch. And just to drive that point home, I literally got a health alert first thing Monday morning:

Nothing like empirical data to make a point! That last weekend when we went live was especially brutal; I don't think I've devoted that much high-intensity time to a software release for decades.

Have I Been Pwned has been a passion for a quarter of my life now. What I built in 2013 was never intended to take me this far or last this long, and I'm kinda shocked it did if I'm honest. I feel that what we've built with this new site and new brand has elevated this little pet project into a serious service that has a new level of professionalism. But I hope that in reading this, you see that it has maintained everything that has always been great about the service, and I'm so glad to still be here writing about it today in the 205th blog post with that tag. Thanks for reading, now go and enjoy the new website 😊

Edit (a few hours after initially posting): Let me expand on Cloudflare's Turnstile as it'll explain some idiosyncrasies some people have seen:

This is an anti-automation approach that doesn't involve palming traffic to Google (like reCAPTCHA did), and it can be implemented completely invisibly. There are more invasive implementations of it, but we're trying to be seamless here. It involves some Cloudflare script running in the browser and providing a challenge, which is then submitted with the HTTP request and verified server side. We've had it on HIBP in one form or another since 2023, and it can be awesome... until it isn't. If the challenge fails, what happens next? It depends.

On forms where we really need to block the robots (for example, any that send email), a failed Turnstile challenge was initially just showing a red error. It now says this:

Our anti-automation process thinks you're a bot, which you're obviously not! Try behaving like a human and clicking the button again and if it still misbehaves, give the page a reload.

We've often found a second click or a page reload solves the problem, so hopefully this sends people in the right direction. If it doesn't, we'll need to look at more in-your-face implementations of Turnstile that show a widget you need to interact with. To have a go yourself and see it in action, try the dashboard sign in page.

The other place Turnstile features heavily is on the main search page at the root of the site. We don't want that API being hit by bots, so it's a must have there. Here, like on the other pages of the new site, we're asynchronously posting to API endpoints and sending the challenge token along with the request. What we're doing differently on the front page, however, is that if the challenge fails and returns HTTP 401 when posted to the HIBP endpoint (you'll also see a response body of "Invalid Turnstile token"), we were meant to be falling back to a full page post. That wasn't happening in the new site when we first launched it. But it is now 🙂

When the full page post back occurs, Cloudflare will present a managed challenge. This is much more invasive, but it's also much more reliable and will then serve the same result as you would have seen anyway, albeit via a full page load. We implement the same managed challenge logic on the deep-linked account pages, which you can see here: https://haveibeenpwned.com/account/test@example.com

According to the Cloudflare stats, about 82% of all our issued challenges are successfully solved:

Of the 18% that aren't, many will be due to bots stopped by Turnstile doing exactly what it's meant to do. It's likely a single-digit percentage of requests that are real humans being impeded, and we need to look at ways to get that number down, but at least the fallback positions are improved now. If you were having problems, give the site a good refresh, see how you go and leave your feedback in the comments below.

🏷️ My labels
- ❌
Article tags
- ❌
- Have I Been Pwned
May 19^th 2025 at 20:19

Troy Hunt
Weekly Update 452
May 16^th 2025 at 21:12

Weekly Update 452

By: Troy Hunt

Funny how excited people can get about something as simple as a sticker. They're always in hot demand and occupy an increasingly large portion of my luggage as we travel around. Charlotte reckoned it would be the same for other merch too, so, while I've been beavering away playing code monkey on the rebranded HIBP website, she built a merch store. Talking about it in this week's video obviously got a bunch of people excited, as a flurry of orders followed. As I said in the video, we put everything up there at cost (ok, so Teepsring made us add 1c to each because you couldn't list exactly at cost), so it's just a fun way to enjoy the new HIBP brand more than anything. Enjoy the merch and this week's video, next week we'll have a brand new site live and ready to talk about 😊

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
Malaysia became our 40th government to take up the HIBP service (actually our first gov from Asia, too)
We're going to put a small number of carefully selected partners on breach pages in HIBP (we want companies that can add something genuinely useful to breach victims)
Merch! (what are we missing?)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
May 16^th 2025 at 21:12

Troy Hunt
Welcoming the Malaysian Government to Have I Been Pwned
May 15^th 2025 at 19:30

Welcoming the Malaysian Government to Have I Been Pwned

By: Troy Hunt

Today, we welcome the 40th government onboarded to Have I Been Pwned's free gov service, Malaysia. The NC4 NACSA (National Cyber Coordination and Command Centre of the National Cyber Security Agency) in Malaysia now has full access to query all their government domains via API, and monitor them against future breaches.

Malaysia is the first Asian nation to make use of this service, and we look forward to seeing many more from this corner of the world in the future.

🏷️ My labels
- ❌
Article tags
- ❌
- Government
May 15^th 2025 at 19:30

Troy Hunt
Weekly Update 451
May 10^th 2025 at 22:28

Weekly Update 451

By: Troy Hunt

The Have I Been Pwned Alpine Grand Tour is upon us! I've often joked that work is always either sitting at my desk at home in isolation or on the other side of the world, and so it is with this trip. As we've done with recent travel to the US and colder parts of Europe, we've booked to travel to places we know have lots of people we're interested in seeing then we'll fill in the itinerary. Since the blog post last week, we've lined up folks in Leichtenstein, Zurich (which will be a publicly event I'll announce soon), Bern, Geneva and Lyon. I'm still trying to make contact with the folks at CERT-MC in Monaco, and same with the Italian equivalent in Rome. I've planned a bit more time at the latter and would like to try and line up another event like we'll be doing in Zurich so if you're over that way and run a user group or similar, I'd love to hear from you.

References

Sponsored by: Join Snyk's May 15th event to discover how to establish a Security Champions program, bridging security and development
If you're interested in a cool panel for putting Home Assistant on the wall somewhere, check out this thread ()
Gambia's national CSRIT is now the 38th gov on HIBP (they're the first African nation to come on board)
And the Isle of Man is the 39th (they're a "self-governing British Crown Dependency", so I've learned something new this week)
Passkeys for normi... normal people! (they can be really simple to setup and use, but that's highly dependent on how the service implements them)
The HIBP Alpine Grand Tour is next month (summer, the Alps, cyber, what more could you want?! 😄)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
May 10^th 2025 at 22:28

Troy Hunt
After the Breach: Finding new Partners with Solutions for Have I Been Pwned Users
May 8^th 2025 at 22:33

After the Breach: Finding new Partners with Solutions for Have I Been Pwned Users

By: Troy Hunt

For many years, people would come to Have I Been Pwned (HIBP), run a search on their email address, get the big red "Oh no - pwned!" response and then... I'm not sure. We really didn't have much guidance until we partnered with 1Password and started giving specific advice about how to secure your digital life. So, that's passwords sorted, but the impact of data breaches goes well beyond passwords alone...

There are many different ways people are impacted by breaches, for example, identity fraud. Breaches frequently contain precisely the sort of information that opens the door to impersonation and just taking a quick look at the HIBP stats now, there's a lot of data out there:

227 breaches exposed physical address
243 breaches exposed date of birth
288 breaches exposed phone numbers

That's just the big numbers, then there's the long tail of all sorts of other exposed high-risk data, including partial credit cards (32 breaches), government-issued IDs (18 breaches) and passport numbers (7 breaches). As well as helping people choose good passwords, we want to help them stay safe in the other aspects of their lives put at risk when hackers run riot.

Identity protection services are a good example, and I might be showing my age here, but I've been using them since the 90's. Today, I use a local Aussie one called Truyu which is built by the Commonwealth Bank. Let me give you two examples from them to illustrate why it's a useful service:

The first one came on Melbourne Cup day last year, a day when Aussies traditionally get drunk and lose money betting on horse races. Because gambling (sorry - "gaming") is a heavily regulated industry, a whole bunch of identity data has to be provided if you want to set up an account with the likes of SportsBet. Whilst I personally maintain that gambling is a tax on people who can't do maths, Charlotte was convinced we should have a go anyway, which resulted in Truyu popping up this alert:

This was me (and yes, of course we lost everything we bet) but... what if it wasn't me, and my personal information had been used by someone else to open the account? That's the sort of thing I'd want to know about fast. As for all those "Illion Credit Header" entries, I asked Truyu to help explain what they mean and why they're important to know:

Illion Credit Header – Banking Finance Segment : This segment includes information that links you to financial institutions—such as banks, lenders, or credit card provider. It helps confirm your financial presence and association with trusted entities, but it can also reveal if your identity is being used across multiple banks fraudulently.
Illion Credit Header – Telecommunications Segment: This covers data from telco providers (e.g., Optus, Telstra, Vodafone), indicating that your identity has been used to open or inquire about telco services. Telco accounts are often targeted for fraud (SIM swaps, device purchases), so unexpected entries here can flag potential misuse of your ID.
Illion Credit Header – Utilities Segment - This segment includes information showing you've been associated with utility services like electricity, gas, or water. If someone uses your ID to set up a utility account, it will show here—often before more obvious signs of fraud occur.
Illion Credit Header – Public Records Segment: This includes any publicly available identity-linked records, such as: Court judgements, Bankruptcies, ASIC or other official listings

Yep, I'd definitely want to know if it wasn't me that initiated all that!

Then, on a recent visit to see the Irish National Cyber Security Centre, we found ourselves hungry in Dublin. Google Maps recommended this epic sushi place, but when we arrived, a sign at the front advised they didn't accept credit cards - in 2025!! Carrying only digital cards, having no cash and being hungry for sushi, I explored the only other avenue the store suggested: creating a Revolut account. Doing so required a bunch of personal information because, like betting, finance is a heavily regulated industry. This earned me another early warning from Truyu about the use of my data:

I pay Truyu A$4.99 each month via a subscription on my iPhone, and IMHO, it's money well spent. For full disclosure, Truyu is also an enterprise subscriber to HIBP (like 1Password is), and you can see breaches we've processed in their app too. I've included them here because they're a great example of a service that adds real value "after the breach", and it's one I genuinely use myself.

The point of all this is that there are organisations out there offering services that are particularly relevant to data breach victims, and we'd like to find the really good ones and put them on the new HIBP website. We've even built out some all-new dedicated spaces, for example on the new breach page:

After the Breach: Finding new Partners with Solutions for Have I Been Pwned Users

But choosing partners is a bit more nuanced than that. For example, a service like Truyu caters to an Aussie audience, and the way identity protection works in the US or UK, for example, is different. We need different partners in different parts of the world, and further, offering different services. Identity protection is one thing, but what else? There are many different risks that both individuals and organisations (of which there are hundreds of thousands using HIBP today) face after being in a data breach.

So, we're looking for more partners that can make a positive difference for the folks that land on HIBP, do a search and then ask "now what?!" We're obviously going to be very selective and very cautious about who we work with because the trust people have in HIBP is not something I'll ever jeopardise by selecting the wrong partners. And, of course, any other brand that appears on this site needs to be one that reflects not just our values and mission, but is complementary to our favourite password manager as well.

Now that we're on the cusp of launching this new site (May 17 is our target), I'm inviting any organisations that think they fit the bill to get in touch with me and explain how they can make a positive difference to data breach victims looking for answers "after the breach".

🏷️ My labels
- ❌
Article tags
- ❌
- Have I Been Pwned
May 8^th 2025 at 22:33

Troy Hunt
Welcoming the Isle of Man Government to Have I Been Pwned
May 8^th 2025 at 07:00

Welcoming the Isle of Man Government to Have I Been Pwned

By: Troy Hunt

Today we welcome the 39th government and first self-governing British Crown Dependency to Have I Been Pwned, The Isle of Man. Their Office of Cyber-Security & Information Assurance (OCSIA) now has free and open access to query the government domains of their jurisdiction.

We're delighted and encouraged to see HIBP put to good use across such a wide variety of government use cases and look forward to seeing many more in the future.

🏷️ My labels
- ❌
Article tags
- ❌
- Government
May 8^th 2025 at 07:00

Troy Hunt
Passkeys for Normal People
May 5^th 2025 at 08:12

Passkeys for Normal People

By: Troy Hunt

Let me start by very simply explaining the problem we're trying to solve with passkeys. Imagine you're logging on to a website like this:

And, because you want to protect your account from being logged into by someone else who may obtain your username and password, you've turned on two-factor authentication (2FA). That means that even after entering the correct credentials in the screen above, you're now prompted to enter the six-digit code from your authenticator app:

There are a few different authenticator apps out there, but what they all have in common is that they display a one-time password (henceforth referred to as an OTP) with a countdown timer next to it:

By only being valid for a short period of time, if someone else obtains the OTP then they have a very short window in which it's valid. Besides, who can possibly obtain it from your authenticator app anyway?! Well... that's where the problem lies, and I demonstrated this just recently, not intentionally, but rather entirely by accident when I fell victim to a phishing attack. Here's how it worked:

Passkeys for Normal People

I was socially engineered into visiting a phishing page that pretended to belong to Mailchimp who I use to send newsletters for this blog. The website address was mailchimp-sso.com, which was close enough to the real address (mailchimp.com) to be feasible. "SSO" is "single sign on", so also seemed feasible.
When I saw the login screen (the one with the big "PHISH" stamp on it), and submitted my username and password to them, the phishing site then automatically used those credentials to begin the login process on Mailchimp.
Mailchimp validated the credentials, and because I had 2FA turned on, then displayed the OTP request screen.
The legitimate OTP screen from Mailchimp was then returned to the bad guys...
...who responded to my login request with their own page requesting the OTP.
I entered the code into the form and submitted it to the phishing site.
The bad guys then immediately sent that request to Mailchimp, thus successfully logging themselves in.

The problem with OTPs from authenticator apps (or sent via SMS) is that they're phishable in that it's possible for someone to trick you into handing one over. What we need instead is a "phishing-resistant" paradigm, and that's precisely what passkeys are. Let's look at how to set them up, how to use them on websites and in mobile apps, and talk about what some of their shortcomings are.

Passkeys for Log In on Mobile with WhatsApp

We'll start by setting one up for WhatsApp given I got a friendly prompt from them to do this recently:

So, let's "Try it" and walk through the mechanics of what it means to setup a passkey. I'm using an iPhone, and this is the screen I'm first presented with:

A passkey is simply a digital file you store on your device. It has various cryptographic protections in the way it is created and then used to login, but that goes beyond the scope of what I want to explain to the audience in this blog post. Let's touch briefly on the three items WhatsApp describes above:

The passkey will be used to logon to the service
It works in conjunction with how you already authenticate to your device
It needs to be stored somewhere (remember, it's a digital file)

That last point can be very device-specific and very user-specific. Because I have an iPhone, WhatsApp is suggesting I save the passkey into my iCloud Keychain. If you have an Android, you're obviously going to see a different message that aligns to how Google syncs passkeys. Choosing one of these native options is your path of least resistance - a couple of clicks and you're done. However...

I have lots of other services I want to use passkeys on, and I want to authenticate to them both from my iPhone and my Windows PC. For example, I use LinkedIn across all my devices, so I don't want my passkey tied solely to my iPhone. (It's a bit clunky, but some services enable this by using the mobile device your passkey is on to scan a QR code displayed on a web page). And what if one day I switch from iPhone to Android? I'd like my passkeys to be more transferable, so I'm going to store them in my dedicated password manager, 1Password.

A quick side note: as you'll read in this post, passkeys do not necessarily replace passwords. Sometimes they can be used as a "single factor" (the only thing you use to login with), but they may also be used as a "second factor" with the first being your password. This is up to the service implementing them, and one of the criticisms of passkeys is that your experience with them will differ between websites.

We still need passwords, we still want them to be strong and unique, therefore we still need password managers. I've been using 1Password for 14 years now (full disclosure: they sponsor Have I Been Pwned, and often sponsor this blog too) and as well as storing passwords (and credit cards and passport info and secure notes and sharing it all with my family), they can also store passkeys. I have 1Password installed on my iPhone and set as the default app to autofill passwords and passkeys:

Because of this, I'm given the option to store my WhatsApp passkey directly there:

The obfuscated section is the last four digits of my phone number. Let's "Continue", and then 1Password pops up with a "Save" button:

Once saved, WhatsApp displays the passkey that is now saved against my account:

And because I saved it into 1Password that syncs across all my devices, I can jump over to the PC and see it there too.

And that's it, I now have a passkey for WhatsApp which can be used to log in. I picked this example as a starting point given the massive breadth of the platform and the fact I was literally just prompted to create a passkey (the very day my Mailchimp account was phished, ironically). Only thing is, I genuinely can't see how to log out of WhatsApp so I can then test using the passkey to login. Let's go and create another with a different service and see how that experience differs.

Passkeys For Log In via PC with LinkedIn

Let's pick another example, and we'll set this one up on my PC. I'm going to pick a service that contains some important personal information, which would be damaging if it were taken over. In this case, the service has also previously suffered a data breach themselves: LinkedIn.

I already had two-step verification enabled on LinkedIn, but as evidenced in my own phishing experience, this isn't always enough. (Note: the terms "two-step", "two-factor" and "multi-factor" do have subtle differences, but for the sake of simplicity, I'll treat them as interchangeable terms in this post.)

Onto passkeys, and you'll see similarities between LinkedIn's and WhatsApp's descriptions. An important difference, however, is LinkedIn's comment about not needing to remember complex passwords:

Let's jump into it and create that passkey, but just before we do, keep in mind that it's up to each and every different service to decide how they implement the workflow for creating passkeys. Just like how different services have different rules for password strength criteria, the same applies to the mechanics of passkey creation. LinkedIn begins by requiring my password again:

This is part of the verification process to ensure someone other than you (for example, someone who can sit down at your machine that's already logged into LinkedIn), can't add a new way of accessing your account. I'm then prompted for a 6-digit code:

Which has already been sent to my email address, thus verifying I am indeed the legitimate account holder:

As soon as I enter that code in the website, LinkedIn pushes the passkey to me, which 1Password then offers to save:

Again, your experience will differ based on which device and preferred method of storing passkeys you're using. But what will always be the same for LinkedIn is that you can then see the successfully created passkey on the website:

Now, let's see how it works by logging out of LinkedIn and then returning to the login page. Immediately, 1Password pops up and offers to sign me in with my passkey:

That's a one-click sign-in, and clicking the purple button immediately grants me access to my account. Not only will 1Password not let me enter the passkey into a phishing site, due to the technical implementation of the keys, it would be completely unusable even if it was submitted to a nefarious party. Let me emphasise something really significant about this process:

Passkeys are one of the few security constructs that make your life easier, rather than harder.

However, there's a problem: I still have a password on the account, and I can still log in with it. What this means is that LinkedIn has decided (and, again, this is one of those website-specific decisions), that a passkey merely represents a parallel means of logging in. It doesn't replace the password, nor can it be used as a second factor. Even after generating the passkey, only two options are available for that second factor:

The risk here is that you can still be tricked into entering your password into a phishing site, and per my Mailchimp example, your second factor (the OTP generated by your authenticator app) can then also be phished. This is not to say you shouldn't use a passkey on LinkedIn, but whilst you still have a password and phishable 2FA, you're still at risk of the same sort of attack that got me.

Passkeys for 2FA with Ubiquiti

Let's try one more example, and this time, it's one that implements passkeys as a genuine second factor: Ubiquiti.

Ubiquiti is my favourite manufacturer of networking equipment, and logging onto their system gives you an enormous amount of visibility into my home network. When originally setting up that account many years ago, I enabled 2FA with an OTP and, as you now understand, ran the risk of it being phished. But just the other day I noticed passkey support and a few minutes later, my Ubiquiti account in 1Password looked like this:

I won't bother running through the setup process again because it's largely similar to WhatsApp and LinkedIn, but I will share just what it looks like to now login to that account, and it's awesome:

I intentionally left this running at real-time speed to show how fast the login process is with a password manager and passkey (I've blanked out some fields with personal info in them). That's about seven seconds from when I first interacted with the screen to when I was fully logged in with a strong password and second factor. Let me break that process down step by step:

When I click on the "Email or Username" field, 1Password suggests the account to be logged in with.
I click on the account I want to use and 1Password validates my identity with Face ID.
1Password automatically fills in my credentials and submits the form.
Ubiquiti asks for my passkey, I click "Continue" and my iPhone uses Face ID again to ensure it's really me.
The passkey is submitted to Ubiquiti and I'm successfully logged in. (As it was my first login via Chrome on my iPhone, Ubiquiti then asks if I want to trust the device, but that happens after I'm already successfully logged in.)

Now, remember "the LinkedIn problem" where you were still stuck with phishable 2FA? Not so with Ubiquiti, who allowed me to completely delete the authenticator app:

But there's one more thing we can do here to strengthen everything up further, and that's to get rid of email authentication and replace it with something even stronger than a passkey: a U2F key.

Physical Universal 2 Factor Key for 2FA with Ubiquiti

Whilst passkeys themselves are considered non-phishable, what happens if the place you store that digital key gets compromised? Your iCloud Keychain, for example, or your 1Password account. If you configure and manage these services properly then the likelihood of that happening is extremely remote, but the possibility remains. Let's add something entirely different now, and that's a physical security key:

This is a YubiKey and you can you can store your digital passkey on it. It needs to be purchased and as of today, that's about a US$60 investment for a single key. YubiKeys are called "Universal 2 Factor" or U2F keys and the one above (that's a 5C NFC) can either plug into a device with USB-C or be held next to a phone with NFC (that's "near field communication", a short-range wireless technology that requires devices to be a few centimetres apart). YubiKeys aren't the only makers of U2F keys, but their name has become synonymous with the technology.

Back to Ubiquiti, and when I attempt to remove email authentication, the following prompt stops me dead in my tracks:

I don't want email authentication because that involves sending a code to my email address and, well, we all know what happens when we're relying on people to enter codes into login forms 🤔 So, let's now walk through the Ubiquiti process and add another passkey as a second factor:

But this time, when Chrome pops up and offers to save it in 1Password, I'm going to choose the little USB icon at the top of the prompt instead:

Windows then gives me a prompt to choose where I wish to save the passkey, which is where I choose the security key I've already inserted into my PC:

Each time you begin interacting with a U2F key, it requires a little tap:

And a moment later, my digital passkey has been saved to my physical U2F key:

Just as you can save your passkey to Apple's iCloud Keychain or in 1Password and sync it across your devices, you can also save it to a physical key. And that's precisely what I've now done - saved one Ubiquiti passkey to 1Password and one to my YubiKey. Which means I can now go and remove email authentication, but it does carry a risk:

This is a good point to reflect on the paradox that securing your digital life presents: as we seek stronger forms of authentication, we create different risks. Losing all your forms of non-phishable 2FA, for example, creates the risk of losing access to your account. But we also have mitigating controls: your digital passkey is managed totally independently of your physical one so the chances of losing both are extremely low. Plus, best practice is usually to have two U2F keys and enrol them both (I always take one with me when I travel, and leave another one at home). New levels of security, new risks, new mitigations.

Finding Sites That Support Passkeys

All that's great, but beyond my examples above, who actually supports passkeys?! A rapidly expanding number of services, many of which 1Password has documented in their excellent passkeys.directory website:

Have a look through the list there, and you'll see many very familiar brands. You won't see Ubiquiti as of the time of writing, but I've gone through the "Suggest new listing" process to have them added and will be chatting further with the 1Password folks to see how we can more rapidly populate that list.

Do also take a look at the "Vote for passkeys support" tab and if you see a brand that really should be there, make your voice heard. Hey, here's a good one to start voting for:

Summary

I've deliberately just focused on the mechanics of passkeys in this blog post, but let me take just a moment to highlight important separate but related concepts. Think of passkeys as one part of what we call "defence in depth", that is the application of multiple controls to help keep you safe online. For example, you should still treat emails containing links with a healthy suspicion and whenever in doubt, not click anything and independently navigate to the website in question via your browser. You should still have strong, unique passwords and use a password manager to store them. And you should probably also make sure you're fully awake and not jet lagged in bed before manually entering your credentials into a website your password manager didn't autofill for you 🙂

We're not at the very beginning of passkeys, and we're also not yet quite at the tipping point either... but it's within sight. Just last week, Microsoft announced that new accounts will be passwordless by default, with a preference to using passkeys. Whilst passkeys are by no means perfect, look at what they're replacing! Start using them now on your most essential services and push those that don't support them to genuinely take the security of their customers seriously.

🏷️ My labels
- ❌
Article tags
- ❌
- Security
May 5^th 2025 at 08:12

Troy Hunt
Weekly Update 450
May 2^nd 2025 at 20:52

Weekly Update 450

By: Troy Hunt

Looking back at this week's video, it's the AI discussion that I think about most. More specifically, the view amongst some that any usage of it is bad and every output is "slop". I'm hearing that much more broadly lately, that AI is both "robbing" creators and producing sub-par results. The latter is certainly true in many cases (although it's improving extraordinarily quickly), but the former is just ridiculous when used as a reason not to use AI. After doing this week's video, I saw press of Satya saying that 30% of code in some Microsoft repositories is written by AI; so, are developers in the same boat? Should we go back to writing more code by hand to keep us more employed? Maybe chuck out all the other efficiency tools we use too - IDEs give way to notepad.exe, and so on. It's kinda nuts.

References

Sponsored by: Malwarebytes Browser Guard blocks phishing, ads, scams, and trackers for safer, faster browsing
NDC Melbourne has been run and done (that's actually the last even on my calendar at present, at last until things start filling in for Europe next month)
We're progressing well with our new Have I Been Pwned challenge coin (but some of the comments about using AI in the process... 😲)
There is a view amongst some that AI just shouldn't be used for things a human could be paid for (I'm sure a similar discussion was had over and over again during the industrial revolution and, well, every other time tech solved a laborious problem)
This Facebook phish was way too convincing (largely due to the shock and emotion it created on first read)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
May 2^nd 2025 at 20:52

The Have I Been Pwned Alpine Grand Tour

By: Troy Hunt

I love a good road trip. Always have, but particularly during COVID when international options were somewhat limited, one road trip ended up, well, "extensive". I also love the recent trips Charlotte and I have taken to spend time with many of the great agencies we've worked with over the years, including the FBI, CISA, CCCS, RCMP, NCA, NCSC UK and NCSC Ireland. So, that's what we're going to do next month across some very cool locations in Europe:

The Have I Been Pwned Alpine Grand Tour

Whilst the route isn't set in stone, we'll start out in Germany and cover Liechtenstein, Switzerland, France, Italy and Austria. We have existing relationships with folks in all but one of those locations (France, call me!) and hope to do some public events as we recently have at Oxford University, Reykjavik and even Perth back on (almost) this side of the world. And that's the reason for writing this post today: if you're in proximity of this route and would like to organise an event or if you're a partner I haven't already reached out to, please get in touch. We usually manage to line up a healthy collection of events and assuming we can do that again on this trip, I'll publish them to the events page shortly. There's also a little bit of availability in Dubai on the way over we'll put to productive use, so definitely reach out if you're over that way.

If you're in another part of the world that needs a visit with a handful of HIBP swag, let me know, there's a bunch of other locations on the short list, and we're always thinking about what's coming next 🌍

🏷️ My labels
- ❌
Article tags
- ❌
- Have I Been Pwned
May 2^nd 2025 at 06:32

Troy Hunt
Welcoming The Gambia National CSIRT to Have I Been Pwned
May 1^st 2025 at 00:29

Welcoming The Gambia National CSIRT to Have I Been Pwned

By: Troy Hunt

Today, we're happy to welcome the Gambia National CSIRT to Have I Been Pwned as the 38th government to be onboarded with full and free access to their government domains. We've been offering this service for seven years now, and it enables national CSIRTs to gain greater visibility into the impact of data breaches on their respective nations.

Our goal at HIBP remains very straightforward: to do good things with data breaches after bad things happen. We hope this initiative helps support the Gambia National CSIRT as it has with many other governments around the world.

🏷️ My labels
- ❌
Article tags
May 1^st 2025 at 00:29

Troy Hunt
Weekly Update 449
April 27^th 2025 at 02:11

Weekly Update 449

By: Troy Hunt

Today, I arrived at my PC first thing in the morning to find the UPS dead (battery was cactus) and the PC obviously without power. So, I tracked down a powerboard and some IEC C14 to mains cable adaptors and powered back up. On boot, neither the Bluetooth mouse nor keyboard worked. So, I tracked down a wired version of each, logged on, didn't find anything weird in the Device Manager, then gave it a reboot, which resulted in the machine not getting past the Lenovo splash screen. So, I rebooted and the same thing happened, unplugged the new USB devices, rebooted again and ended up on the Bitlocker key entry screen. So, on my spare PC I went to my Microsoft account, retrieved the correct key for the disk in question, rebooted and ended up on the recovery screen. So, I ran the recovery process and, much to my surprise, got straight back into Windows.

That's what trying to work out the login / log in / log on / sign in thing was like this week; incrementally shaving the yak until things work and make sense!

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
The new Pwned Passwords search is actually too fast! (settle down, usability isn't as simple as "always make everything as fast as possible")
I went down the "login" rabbit hole and emerged with "sign in" (I still feel this was the most logical conclusion to reach)
Keep those great HIBP UX ideas coming! (May 17 is our go-live date for the new UX, and it's going to be amazing!)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
April 27^th 2025 at 02:11

Troy Hunt
You'll Soon Be Able to Sign in to Have I Been Pwned (but Not Login, Log in or Log On)
April 24^th 2025 at 05:48

You'll Soon Be Able to Sign in to Have I Been Pwned (but Not Login, Log in or Log On)

By: Troy Hunt

How do seemingly little things manage to consume so much time?! We had a suggestion this week that instead of being able to login to the new HIBP website, you should instead be able to log in. This initially confused me because I've been used to logging on to things for decades:

So, I went and signed in (yep, different again) to X and asked the masses what the correct term was:

When accessing your @haveibeenpwned dashboard, which of the following should you do? Preview screen for reference: https://t.co/9gqfr8hZrY
— Troy Hunt (@troyhunt) April 23, 2025

Which didn't result in a conclusive victor, so, I started browsing around.

Cloudflare's Zero Trust docs contain information about customising the login page, which I assume you can do once you log in:

Another, uh, "popular" site prompts you to log in:

After which you're invited to sign in:

You can log in to Canva, which is clearly indicated by the HTML title, which suggests you're on the login page:

You can log on to the Commonwealth Bank down here in Australia:

But the login page for ANZ bank requires to log in, unless you've forgotten your login details:

Ah, but many of these are just the difference between the noun "login" (the page is a thing) and the verb "log in" (when you perform an action), right? Well... depends who you bank with 🤷‍♂️

And maybe you don't log in or login at all:

Finally, from the darkness of seemingly interchangeable terms that may or may not violate principles of English language, emerged a pattern. You also sign in to Google:

And Microsoft:

And Amazon:

And Yahoo:

And, as I mentioned earlier, X:

And now, Have I Been Pwned:

You'll Soon Be Able to Sign in to Have I Been Pwned (but Not Login, Log in or Log On)

There are some notable exceptions (Facebook and ChatGPT, for example), but "sign in" did emerge as the frontrunner among the world's most popular sites. If I really start to overthink it, I do feel that "log[whatever]" implies something different to why we authenticate to systems today and is more a remnant of a bygone era. But frankly, that argument is probably no more valid than whether you're doing a verb thing or a noun thing.

🏷️ My labels
- ❌
Article tags
April 24^th 2025 at 05:48

Troy Hunt
Weekly Update 448
April 22^nd 2025 at 00:20

Weekly Update 448

By: Troy Hunt

I'm a few days late this week, finally back from a month of (almost) non-stop travel with the last bit being completely devoid of an internet connection 😲 And now, the real hard work kicks in as we count down the next 25 days before launching the full HIBP rebrand. I'm adamant we're going to push this out on the 17th of May, and I reckon it's looking absolutely awesome! Do please feel free to check out what we're doing and chime in on the GitHub repository via the links below. I'm sure there's a lot of untapped potential yet to be unlocked.

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
I'm speaking at NDC Melbourne on Wednesday 30 (lots of data breachy stuff, unsurprisingly)
The LabHost "phishing as a service" platform has been well and truly pwned by our law enforcement friends (they've sent us over hundreds of thousands of passwords from the now-defunct service that are now searchable in HIBP)
Samsung Germany had more than 200k of their customers' records breached via a third party (this was all allegedly caused by an infostealer infecting a Spectos employee)
Each and every interface is being built in the public domain (that's the live preview link, which is just a static site, but you can click through it and get a really good idea of how it will all look)
We're welcoming feedback via the issues log and discussion list on the open source GitHub repo (lots of good stuff has already come in via there)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
April 22^nd 2025 at 00:20

Troy Hunt
Weekly Update 447
April 12^th 2025 at 05:44

Weekly Update 447

By: Troy Hunt

I'm home! Well, for a day, then it's off to the other side of the country (which I just flew over last night on the way back from Dublin 🤦‍♂️) for an event at the Microsoft Accelerator in Perth on Monday. Such is the path we've taken, but it does provide some awesome opportunities to meet up with folks around the world and see some really interesting stuff. Come by if you're over that way or if you're on the east coast of Aus, I'll be at NDC Melbourne only a couple of weeks later. And somewhere in the midst of all that, we'll get this HIBP UX rebuild finished...

References

Sponsored by: Malwarebytes Browser Guard blocks phishing, ads, scams, and trackers for safer, faster browsing
I'm speaking at the Microsoft Student Accelerator in Perth on Monday (it's free, and you don't need to be a student 🙂)
We're going to incorporate some more partners into HIBP where they can offer useful services to data breach victims (the thinking is that they'll appear on the dedicated breach page where they can offer something useful as it relates to that specific incident)
The HIBP UX rebuild repo is tracking everything we're doing (chime in on the discussions or submit any issues you find)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
April 12^th 2025 at 05:44

Troy Hunt
Weekly Update 446
April 5^th 2025 at 14:50

Weekly Update 446

By: Troy Hunt

After an unusually long day of travelling from Iceland, we've finally made it to the land of Guinness, Leprechauns, and a tax haven for tech companies. This week, there are a few more lessons from the successful phish against me the previous week, and in happier news, there is some really solid progress on the HIBP UX rebuild. We spent a bunch of time with Stefan and Ingiber (the guy rebuilding the front end) whilst in Reykjavik and now have a very clear plan mapped out to get this finished in the next 6 weeks. More on that in this week's update, enjoy!

References

Sponsored by: Malwarebytes Browser Guard blocks phishing, ads, scams, and trackers for safer, faster browsing
Silent Push has done some great analysis on the source of my phish (they've linked it similar attacks against SendGrid and Mailgun accounts, among others)
Every outstanding HIBP UX rebuild task is now on public display (we're targeting 17 May to complete all this and roll out the new site)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
April 5^th 2025 at 14:50

Troy Hunt
Weekly Update 445
March 30^th 2025 at 16:48

Weekly Update 445

By: Troy Hunt

Well, this certainly isn't what I expected to be talking about this week! But I think the fact it was someone most people didn't expect to be on the receiving end of an attack like this makes it all the more consumable. I saw a lot of "if it can happen to Troy, it can happen to anyone" sort of commentary and whilst it feels a bit of obnoxious for me to be saying it that way, I appreciate the sentiment and the awareness it drives. It sucked, but I'm going to make damn sure we get a lot of mileage out of this incident as an industry. I've no doubt whatsoever this is a net-positive event that will do way more good than harm. On that note, stay tuned for the promised "Passkeys for Normal People" blog post, I hope to be talking about that in next week's video (travel schedule permitting). For now, here's the full rundown of how I got phished:

References

Sponsored by: Malwarebytes Browser Guard blocks phishing, ads, scams, and trackers for safer, faster browsing
I obviously didn't like being on the receiving end of this, but I reckon 34 minutes from pwned to public disclosure is a new record 😊 (this is what I'm going to be driving organisations towards in many future data breach cases)
Despite me falling for something I should have spotted, the public response and press had been outstandingly positive (that's a piece from this week's sponsor, I felt their writeup summed things up nicely)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
March 30^th 2025 at 16:48

Troy Hunt
A Sneaky Phish Just Grabbed my Mailchimp Mailing List
March 25^th 2025 at 07:34

A Sneaky Phish Just Grabbed my Mailchimp Mailing List

By: Troy Hunt

You know when you're really jet lagged and really tired and the cogs in your head are just moving that little bit too slow? That's me right now, and the penny has just dropped that a Mailchimp phish has grabbed my credentials, logged into my account and exported the mailing list for this blog. I'm deliberately keeping this post very succinct to ensure the message goes out to my impacted subscribers ASAP, then I'll update the post with more details. But as a quick summary, I woke up in London this morning to the following:

I went to the link which is on mailchimp-sso.com and entered my credentials which - crucially - did not auto-complete from 1Password. I then entered the OTP and the page hung. Moments later, the penny dropped, and I logged onto the official website, which Mailchimp confirmed via a notification email which showed my London IP address:

I immediately changed my password, but not before I got an alert about my mailing list being exported from an IP address in New York:

And, moments after that, the login alert from the same IP:

This was obviously highly automated and designed to immediately export the list before the victim could take preventative measures.

There are approximately 16k records in that export containing info Mailchimp automatically collects and they appear as follows:

[redacted]@gmail.com,Weekly,https://www.troyhunt.com/i-now-own-the-coinhive-domain-heres-how-im-fighting-cryptojacking-and-doing-good-things-with-content-security-policies/#subscribe,2,"2024-04-13 22:03:08",160.154.[redacted].[redacted],"2024-04-13 22:00:50",160.154.[redacted].[redacted],5.[redacted lat],'-4.[redacted long],0,0,Africa/Abidjan,CI,AB,"2024-04-13 22:03:08",130912487,3452386287,,

Every active subscriber on my list will shortly receive an email notification by virtue of this blog post going out. Unfortunately, the export also includes people who've unsubscribed (why does Mailchimp keep these?!) so I'll need to work out how to handle those ones separately. I've been in touch with Mailchimp but don't have a reply yet, I'll update this post with more info when I have it.

I'm enormously frustrated with myself for having fallen for this, and I apologise to anyone on that list. Obviously, watch out for spam or further phishes and check back here or via the social channels in the nav bar above for more. Ironically, I'm in London visiting government partners, and I spent a couple of hours with the National Cyber Security Centre yesterday talking about how we can better promote passkeys, in part due to their phishing-resistant nature. 🤦‍♂️

More soon, I've hit the publish button on this 34 mins after the time stamp in that first email above.

More Stuff From After Initial Publish

Every Monday morning when I'm at home, I head into a radio studio and do a segment on scams. It's consumer-facing so we're talking to the "normies" and whenever someone calls in and talks about being caught in the scam, the sentiment is the same: "I feel so stupid". That, friends, is me right now. Beyond acknowledging my own foolishness, let me proceed with some more thoughts:

Firstly, I've received a gazillion similar phishes before that I've identified early, so what was different about this one? Tiredness, was a major factor. I wasn't alert enough, and I didn't properly think through what I was doing. The attacker had no way of knowing that (I don't have any reason to suspect this was targeted specifically at me), but we all have moments of weakness and if the phish times just perfectly with that, well, here we are.

Secondly, reading it again now, that's a very well-crafted phish. It socially engineered me into believing I wouldn't be able to send out my newsletter so it triggered "fear", but it wasn't all bells and whistles about something terrible happening if I didn't take immediate action. It created just the right amount of urgency without being over the top.

Thirdly, the thing that should have saved my bacon was the credentials not auto-filling from 1Password, so why didn't I stop there? Because that's not unusual. There are so many services where you've registered on one domain (and that address is stored in 1Password), then you legitimately log on to a different domain. For example, here's my Qantas entry:

And the final thought for now is more a frustration that Mailchimp didn't automatically delete the data of people who unsubscribed. There are 7,535 email addresses on that list which is nearly half of all addresses in that export. I need to go through the account settings and see if this was simply a setting I hadn't toggled or something similar, but the inclusion of those addresses was obviously completely unnecessary. I also don't know why IP addresses were captured or how the lat and long is calculated but given I've never seen a prompt for access to the GPS, I imagine it's probably derived from the IP.

I'll park this here and do a deeper technical dive later today that addresses some of the issues I've raised above.

The Technical Bits

I'll keep writing this bit by bit (you may see it appear partly finished while reading, so give the page a refresh later on), starting with the API key that was created:

This has now been deleted so along with rolling the password, there should no longer be any persistent access to the account.

Unfortunately, Mailchimp doesn't offer phishing-resistant 2FA:

By no means would I encourage people not to enable 2FA via OTP, but let this be a lesson as to how completely useless it is against an automated phishing attack that can simply relay the OTP as soon as it's entered. On that note, another ridiculous coincidence is that in the same minute that I fell for this attack, I'd taken a screen cap of the WhatsApp message below and shown Charlotte - "See, this reinforces what we were talking about with the NCSC yesterday about the importance of passkeys":

Another interesting angle to this is the address the phish was sent to:

The rest of that address is probably pretty predictable (and I do publish my full "normal" address on the contact page of this blog, so it's not like I conceal it from the public), but I find it interesting that the phish came to an address only used for Mailchimp. Which leaves two possibilities:

Someone specifically targeted me and knew in advance the pattern I use for the address I sign up to services with. They got it right first go without any mail going to other addresses.
Someone got the address from somewhere else, and I've only ever used it in one place...

Applying some Occam's razor, it's the latter. I find the former highly unlikely, and I'd be very interested to hear from anyone else who uses Mailchimp and received one of these phishes.

Still on email addresses, I originally read the phish on my iThing and Outlook rendered it as you see in the image above. At this point, I was already on the hook as I intended to login and restore my account, so the way the address then rendered on the PC didn't really stand out to me when I switched devices:

That's so damn obvious 🤦‍♂️ The observation here is that by not rendering the sender's address, Outlook on iOS hid the phish. But having said that, by no means can you rely on the address as a solid indicator of authenticity but in this case, it would have helped.

Curious as to why unsubscribed users were in the corpus of exported data, I went searching for answers. At no point does Mailchimp's page on unsubscribing mention anything about not deleting the user's data when they opt out of receiving future emails. Keeping in mind that this is AI-generated, Google provided the following overview:

That "Purpose of Keeping Unsubscribes" section feels particularly icky and again, this is the AI and not Mailchimp's words, but it seems to be on point. I can go through and delete unsubscribed addresses (and I'll do that shortly as the last thing I'm going to do now is rush into something else), but then it looks like that has to be a regular process. This is a massive blindspot on Mailchimp's behalf IMHO and I'm going to provide that feedback to them directly (just remembered I do know some folks there).

I just went to go and check on the phishing site with the expectation of submitting it to Google Safe Browsing, but it looks like that will no longer be necessary:

2 hours and 15 minutes after it snared my creds, Cloudflare has killed the site. I did see a Cloudflare anti-automation widget on the phishing page when it first loaded and later wondered if that was fake or they were genuinely fronting the page, but I guess that question is now answered. I know there'll be calls of "why didn't Cloudflare block this when it was first set up", but I maintain (as I have before in their defence), that it's enormously hard to do that based on domain or page structure alone without creating a heap of false positives.

On the question of the lat and long in the data, I just grabbed my own records and found an IP address belonging to my cellular telco. I had two records (I use them to test both the daily and weekly posts), both with the same IP address and created within a minute of each other. One had a geolocation in Brisbane and the other in far north Queensland, about 1,700km away. In other words, the coords do not pinpoint the location of the subscriber, but the record does contain "australia/brisbane,au,qld" so there's some rough geolocation data in there.

Loading the List into Have I Been Pwned

When I have conversations with breached companies, my messaging is crystal clear: be transparent and expeditious in your reporting of the incident and prioritise communicating with your customers. Me doing anything less than that would be hypocritical, including how I then handle the data from the breach, namely adding it to HIBP. As such, I’ve now loaded the breach and notifications are going out to 6.6k impacted individual subscribers and another 2.4k monitoring domains with impacted email addresses.

Looking for silver linings in the incident, I’m sure I’ll refer this blog post to organisations I disclose future breaches to. I’ll point out in advance that even though the data is “just” email addresses and the risk to individuals doesn’t present a likelihood of serious harm or risk their rights and freedoms (read that blog post for more), it’s simply the right thing to do. In short, for those who read this in future, do not just as I say, but as I do.

The Washup

I emailed a couple of contacts at Mailchimp earlier today and put two questions to them:

Are passkeys on your roadmap
Where does Mailchimp stand on “unsubscribe” not deleting the data

A number of people have commented on social media about the second point possibly being to ensure that someone who unsubscribes can’t then later be resubscribed. I’m not sure that argument makes a lot of sense, but I’d like to see people at least being given the choice. I’m going to wait on their feedback before deciding if I should delete all the unsubscribed emails myself, I’m not even sure if that’s possible via the UI or requires scripting against the API,.

The irony of the timing with this happening just as I’ve been having passkey discussions with the NCSC is something I’m going to treat as an opportunity. Right before this incident, I’d already decided to write a blog post for the normies about passkey, and now I have the perfect example of their value. I’d also discussed with the NCSC about creating a passkey equivalent of my whynohttps.com project which highlighted the largest services not implementing HTTPS by default. As such, I’ve just registered whynopasskeys.com (and its singular equivalent) and will start thinking more about how to build that out so we can collectively put some pressure on the services that don’t support unphishable second factors. I actually attempted to register that domain whilst out walking today, only to be met with the following courtesy of DNSimple:

Using a U2F key on really important stuff (like my domain registrar) highlights the value of this form of auth. Today’s phish could not have happened against this account, nor the other critical ones using a phishing resistant second factor and we need to collectively push orgs in this direction.

Sincere apologies to anyone impacted by this, but on balance I think this will do more good than harm and I encourage everyone to share this experience broadly.

Update 1: I'll keep adding more thoughts here via updates, especially if there's good feedback or questions from the community. One thing I'd intended to add earlier is that the more I ponder this, the more likely I think it is that my unique Mailchimp address was obtained from somewhere as opposed to guessed in any targeted fashion. A possible explanation is the security incident they had in 2022, which largely targeted crypto-related lists, but I imagine would likely have provided access to the email addresses of many more customers too. I'll put that to them when I get a response to my earlier email.

Update 2: I now have an open case with Mailchimp and they've advised that "login and sending for the account have been disabled to help prevent unauthorized use of the account during our investigation". I suspect this explains why some people are unable to now sign up to the newsletter, I'll try and get that reinstated ASAP (I'd rolled creds immediately and let's face it, the horse has already bolted).

Pondering this even further, I wonder if Mailchimp has any anti-automation controls on login? The credentials I entered into the phishing site were obviously automatically replayed to the legitimate site, which suggests something there is lacking.

I also realised another factor that pre-conditioned me to enter credentials into what I thought was Mailchimp is their very short-lived authentication sessions. Every time I go back to the site, I need to re-authenticate and whilst the blame still clearly lies with me, I'm used to logging back in on every visit. Keeping a trusted device auth'd for a longer period would likely have raised a flag on my return to the site if I wasn't still logged in.

Update 3: Mailchimp has now restored access to my account and the newsletter subscription service is working again. Here's what they've said:

We have reviewed the activity and have come to the same conclusion that the unauthorized export and API key from 198.44.136.84 was the scope of the access. Given we know how the access took place, the API key has been deleted, and the password has been reset, we have restored your access to the account.

They've also acknowledged several outstanding questions I have (such as whether passkeys are on the roadmap) and have passed them along to the relevant party. I'll update this post once I have answers.

There's been a lot of discussion around "Mailchimp are violating my local privacy laws by not deleting emails when I unsubscribe", and that's one of the outstanding questions I've sent them. But on that, I've had several people contact me and point out this is not the case as the address needs to be retained in order to ensure an opted-out individual isn't later emailed if their address is imported from another source. Read this explainer from the UK's ICO on suppression lists, in particular this para:

Because we don’t consider that a suppression list is used for direct marketing purposes, there is no automatic right for people to have their information on such a list deleted.

I suspect this explains Mailchimp's position, but I suggest that should be clearer during the unsubscribe process. I just went through and tested it and at no time is it clear the email address will be retained for the purpose of supression:

My suggestion would be to follow our approach for Have I Been Pwned where we give people three choices and allow them to choose how they'd like their data to be handled:

At present, Mailchimp is effectively implementing the first option we provide and the folks that are upset were expecting the last option. Hopefully they'll consider a more self-empowering approach to how people's data is handled, I'll update this blog post once I have their response.

Update 4: Someone has pointed out that the sending email address in the phish actually belongs to a Belgian cleaning company called Group-f. It's not unusual for addresses like this to be used to send malicious mail as they usually don't have a negative reputation and more easily pass through spam filters. It also indicates a possible compromise on their end, so I've now reached to them to report the incident.

Update 5: I've been contacted by someone that runs a well-known website that received the same phishing email as me. They made the following observation regarding the address that received the phish:

We have subscribed to Mailchimp with an address that is only used to subscribe to services, no outgoing communication from us. The phishing emails were delivered to exactly this address, couldn't yet find them on any other address. This makes me very much believe that possibility #2 is the case - they got the address from somewhere.

This aligns with my earlier observation that a customer list may have been obtained from Mailchimp and used to send the phishing emails. They went on to say they were seeing multiple subsequent phishes targeting their Mailchimp account.

Btw, we got some more (Mailchimp) phishing emails today — same style, this time 4 times writing about a new login detected, and once that an abuse report was received and we needed to take immediate action.

That a customer list may have been compromised was one of the questions I put to Mailchimp and am still awaiting an answer on. That was about 36 hours ago now, so I've just given them a little nudge.

Update 6: There have been a lot of suggestions that Mailchimp should be storing the hashes of unsubscribed emails rather than the full addresses in the clear. I understand the sentiment, and it does offer some protection, but it by no means ticks the "we no longer have the address" box. This is merely pseudoanonymisation, and the hashed address can be resolved back to the clear if you have a list of plain text candidates to hash and compare them to. There's a good explainer of this in the answer to this question on Security Stack Exchange about hashing email addresses for GDPR compliance. IMHO, my example of how we handle this in HIBP is the gold standard that Mailchimp should be implementing.

And there's also another problem: short of cracking the hashed addresses, you can never export a list of unsubscribed email addresses, for example, if you wanted to change mail campaign provider. The only way that would work is if the hashing algorithm is the same in the destination service, or you build some other level of abstraction at any other future point where you need to compare plain text values to the hashed impression list. It's messy, very messy.

Update 7: Validin has written a fantastic piece about Pulling the Threads of the Phish of Troy Hunt that takes a deep dive into the relationship between the domain the phish was hosted on and various other campaigns they've observed.

Given these similarities, we believe the phishing attempt of Troy Hunt is very likely Scattered Spider.

Scattered Spider certainly has previous form, and this was a very well-orchestrated phish. Four days on as I write this, it's hard not to be a bit impressed about how slick the whole thing was.

🏷️ My labels
- ❌
March 25^th 2025 at 07:34

Troy Hunt
Weekly Update 444
March 21^st 2025 at 05:37

Weekly Update 444

By: Troy Hunt

It's time to fly! 🇬🇧 🇮🇸 🇮🇪 That's two new flags (or if you're on Windows and can't see flag emojis, that's two new ISO codes) I'll be adding to my "places I've been list" as we start the journey by jetting out to London right after I publish this blog. If you're in the area, I'll be speaking at Oxford University on Wednesday at 17:00 and that's a free and open event. And since recording this morning, we have managed to confirm that I will be speaking at a community event in Reykjavik the following Monday morning, and you'll see a link on my 2025 events page as soon as they make one available. No public events planned for Ireland yet, but if you're in Dublin and would like to run something the week after I'm in Iceland, get in touch. Just to round out a big schedule, I'll be back in Aus speaking in Perth at Microsoft's Student Accelerator on 14 April and then it's off to NDC Melbourne shortly after that for a talk on the 30th. Then rest 🙂

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
Cloudflare has found almost half of the passwords people use on their customers' sites are compromised (but somehow, that's not the story that got many people's attention)
Cloudflare's stats were gathered via their leaked credential detection service (one of the sources they use for this is Have I Been Pwned's Pwned Passwords)
And no, a password alone is not personally identifiable information (yes, that's an AI-generated response because, no, you can't find any reference whatsoever to a password being PII in any formal gov docs)
The Lexipol breach went into HIBP (apparently it was carried out by "Puppygirl Hacker Polycule", who'd have thunk it?!)
SpyX also went in (Zack reckons this is the 25th spyware service to be breached since 2017)
We're smashing out front end work for the HIBP UX rebuild (go and check out that repo, submit issues and join in on the discussion, we'd love your input)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
March 21^st 2025 at 05:37

Troy Hunt
Weekly Update 443
March 16^th 2025 at 00:20

Weekly Update 443

By: Troy Hunt

What an awesome response to the new brand! I'm so, so happy with all the feedback, and I've gotta be honest, I was nervous about how it would be received. The only negative theme that came through at all was our use of Sticker Mule, which apparently is akin to being a Tesla owner. Political controversy aside, this has been an extremely well-received launch and I've also loved seeing the issues raised on the open source repo for the front end and Ingiber's (near instant!) addressing of each and every one of them. Please keep that feedback coming, and I'll talk more about some of the changes we've made as a result in the next weekly update.

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
We've open sourced the repo with the front end dev work (please feel free to raise issues, chime in on the discussion and submit PRs)
Every commit we make to the above repo is pushed out to a static site at preview.haveibeenpwned.com (remember - it's static - this is front end stuff only)
We're pushing to the preview site using Cloudflare Pages (this is such a cool, easy way of deploying code)
We've made the stickers available via a Sticker Mule store (there's no markup on these, just get 'em at cost)
We've also put the stickers, 3D models and other visual assets in the open source branding repo (especially handy if you want to get stickers made at a place that aligns to your political preference 😝)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
March 16^th 2025 at 00:20

Troy Hunt
Soft-Launching and Open Sourcing the Have I Been Pwned Rebrand
March 11^th 2025 at 19:56

Soft-Launching and Open Sourcing the Have I Been Pwned Rebrand

By: Troy Hunt

Designing the first logo for Have I Been Pwned was easy: I took a SQL injection pattern, wrote "have i been pwned?" after it and then, just to give it a touch of class, put a rectangle with rounded corners around it:

Job done! I mean really, what more did I need for a pet project with a stupid name that would likely only add to the litany of failed nerdy ideas I'd had before that? And then, to compress 11 and a bit years into a single sentence: it immediately became unexpectedly popular, I added an API and a notification service, I said "pwned" before US Congress, I added Pwned Passwords, went through a failed M&A, hired a developer and basically, devoted my life to running this service. There's been some "water under the bridge", so to speak.

The rebrand we're soft-launching today has been a long time coming, and true to that form, we're not rushing it. This is a "soft launch" in that we're sharing work in progress that's sufficiently evolved to put it out there to the public, but you won't see it in production anywhere yet. The website is no different, the social channels still have the same hero shots and avatars etc. This is the time to seek feedback and tweak before committing more effort into writing code and pushing this to the masses.

A quick primer on "why", as the question has come up a few times whilst previously discussing this. Assume for a moment that my valiant 2013 attempt at a logo was, itself, aesthetically sufficient. It's a hard one to use in different use cases (favicon, merch) and it's quite "busy" in it's current form with no easily recognisable symbol which makes it hard to apply to many use cases. And there are loads of use cases; I mentioned a couple just now, but how about in formal documents such a the contracts we write for enterprise customers? Or as it appears on Stripe-generated invoices, stickers, my 3D printed logos, email signatures and so on and so forth. And branding isn't just a logo, it's a whole set of different use cases and variants of the logo and colours such that you have flexibility to present the brand's image in a cohesive, recognisable fashion. Branding is an art form.

At one point there, I'd had a go at redoing the logo myself. It was terrible. You know how you can have this vision of something aesthetic in your mind and know instantly if it's the right thing when you see it, but just can't quite articulate it yourself? I'm like that with interior design... and logos. So, I reached out to Fiverr for help, and immediately regretted it:

I mean... wow. Ok, I get free revisions, let's give the designer another chance:

Dammit! This just wasn't going to work, and we were going to need to make a much more serious commitment if we wanted this done right. So, we went to Luft Design in Norway as Charlotte and Mikael went way back, and with his help, we went around and around through various iterations of mood boards, design styles, colours and carved out time in Oslo during our visit there in December to sit with Stefan as well and really nut this thing out. I was adamant that I wanted something immediately recognisable but also modern and cohesive without being fussy. Basically, give me everything, which Mikael did:

Let me talk you through the logic of these three variations, beginning with the icon. Mikael initially gave us multiple possible variations of a totally different icons which implied different things. My issue with that is you have to know what the symbology means in order for it to make sense. Perhaps if you're starting from scratch that can work, but when you're a decade+ into a name and a brand, there's history that I think you need to carry forward. One of the variations Mikael did reused that original SQL injection pattern I applied to the logo back in 2013 and just for the sake of justifying my choice, here's what it means for the uninitiated:

Take a SQL query like this:

SELECT * From User WHERE Name = 'blah'

Now, imagine "blah" is untrusted user input, that being data that someone submits via a form, for example. They might then change "blah" to the following:

blah';DROP TABLE USER

We'll shortcut the whole SQL injection lesson about validation of untrusted data and parameterization of queries and just jump straight to the resultant query:

SELECT * From User WHERE Name = 'blah';DROP TABLE USER'

And now, due to the additional query appended to the original one, your user table is gone. However... the SQL has a syntax error as there's a rogue apostrophe hanging off the end, so we fix it by using commenting syntax like so:

blah';DROP TABLE USER;--

Chief among the characters in that pattern are these guys:

';--

And that's the history; these are characters that play a role in the form of attack that has led to so many of the breaches in HIBP today. Turns out they're also really easy to stylise and represent as a concise logo:

We agonised over variations of this for months. The problem is that when you think about all the ones that are really recognisable without accompanying words, they're recognisable because the brand is massive. The Nike swoosh, the Mitsubishi diamonds, the Pepsi circle, the Apple logo etc. HIBP obviously doesn't have that level of cachet, but I really like the simplicity of reach of those, and that's what we have with this one as well as that connection to the history of the brand and the practical use of those characters.

But just as with many of those other recognisable logos, these are times when what is effectively just a logo alone isn't enough, so we have the longer form version:

"Have I Been Pwned" is a mouthful. It's not just long to say, it's long to put on the screen, long to print as a sticker, long to put on a shirt and so on and so forth. "Pwned", on the other hand, is short, concise and, I'd argue, has acheived much greater recognition as a word due to HIBP. Reading how “PWNED” went from hacker slang to the internet’s favorite taunt, I think that's a fair conclusion to draw. For a moment, we even toyed with the idea of an actual rename to just "Pwned" and looked at trying to buy pwned.com via a broker which, uh, didn't work out real well:

Appartently, you can put a price on it! So no, we're not renaming anything, we're just providing various stylistic options for representing the logo. This is why we still have the much wordier versions as well:

Unlike old mate at Fiverr, a proper branding exercise like Mikael has done goes well beyond just the logo alone. For example, we have a colour palette:

And we have typography:

Hoodies:

And t-shirts:

You get the idea.

But most importantly, there's the website. Obviously the brand needs to prevail across to the digital realm, but there's also the issue of the front-end tech stack we build on, and that's something I've been thinking about for months now:

In 2013, I built the front end of @haveibeenpwned on Bootstrap and jQuery. In 2025, @stebets and I are rebuilding it as part of a rebrand. What should we use? What are the front end tools that make web dev awesome today? (vanilla HTML, CSS and JS aside, of course)
— Troy Hunt (@troyhunt) December 27, 2024

You can read all sorts of different suggestions in that thread but in the end, we decided to keep it simple:

Bootstrap 5
Vanilla JS (i.e. just write JavaScript without a framework dependency like jQuery)
Sass (which compiles to CSS anyway)

And that's it. Except Stefan and I are busy guys and we really didn't want to invest our precious cycles rebuilding the front end, so we got Ingiber Olafsson to do it. Ingiber came to us via Stefan (so now we have two Icelanders, two Norwegians and... me), and he's been absolutely smashing out the new front end of HIBP:

What I've really enjoyed with Ingiber's approach is that everything he's built is super clean, lightweight and visually beautiful (based on Mikael's work, of course). I've really appreciated his attention to detail that isn't always obvious too, for example making sure accessibility for the visually impaired is maximised:

Ingiber has helped get us to the point where very soon, Stefan and I will begin the integration work to roll the new brand into the main website. That's not just branding work either as the UX is getting a major overhaul. Some stuff is fairly minor: the list of pwned websites is now way too large and we need to have a dedicated page per breach. Other stuff is much more major: we want to have a specific "login" facility (quoting as it will likely remain passwordless by sending a token via email), where we'll then consolidate everything from notification enrolment to domain management to viewing stealer logs. It's a significant paradigm shift that requires a lot of very careful thought.

A quick caveat on the examples above and the others in the repository: we've given Ingiber free reign to experiment and throw ideas around. As a result, we've got some awsome stuff we hadn't thought about before. We've also got some stuff that will be infeasible in the short term, for example, a link through to the official response of the breached company and the full timeline of events. I hope ideas like this keep coming (both from Ingiber and the community), but just keep in mind that some things you see in this repo won't be on the website the day we roll all this out.

As with so much of this project since day one, we're doing this out in the open for everyone to see. Part of that is this blog post heralding what's to come, and part of it is also open sourcing the ux-rebuild repository. I actually created that repo more than a year ago and started crowd-sourcing ideas before closing it off last month whilst Ingiber got working. It's now open again, and I'd like to invite anyone interested to check out what we're building, leave their comments (either here on in the repo), send PRs and so on and so forth. I'm really stoked with the work the guys I've mentioned in this blog post have done, but there will be other great ideas that none of us have thought of yet. And if you come up with something awesome, we already have truckloads of stickers and 3D printed logos I'd love to send you:

So there we have it, that's the rebrand. Do please send us your feedback, not just about logos and look and feel, but also what you'd like to see UX and feature wise on the website. The discussions list on that repo is a great place the chime in or add new ideas, or even just the comments section below 👇

Edit: Wow, all the responses have been awesome! Gotta be honest, I was nervous redefining the brand after so long, but I couldn't have hoped for a better response 😊 I have two quick additions to this post:

Due to popular demand, I've opened a store on Sticker Mule where you can now purchase the stickers. These are listed at their cost price, there's no markup from us, just enjoy them and share liberally.
I should have thought of this before publishing this post, but we've now published the static HTML pages to preview.haveibeenpwned.com. This is running on Cloudflare pages and is auto-deployed on each GitHub merge into main, so you'll see this continue to evolve over the coming weeks.

🏷️ My labels
- ❌
Article tags
- ❌
- Have I Been Pwned
March 11^th 2025 at 19:56

Troy Hunt
Weekly Update 442
March 8^th 2025 at 08:00

Weekly Update 442

By: Troy Hunt

We survived the cyclone! That was a seriously weird week with lots of build-up to an event that last occurred before I was born. It'd been 50 years since a cyclone came this far south, and the media was full of alarming predictions of destruction. In the end, we maxed out at 52kts just after I recorded this video:

It’s here. But 47kts max gusts isn’t too bad, nothing actually blowing over here (yet). pic.twitter.com/qFyrZdiyRW
— Troy Hunt (@troyhunt) March 7, 2025

We remained completely untouched and unaffected beyond needing to sweep up some leaves once the rain (which has also been unremarkable), finally stops. It appears the worst damage has been a lot of homes without power and perhaps most obviously, the beaches have done a complete vanishing act with all the sand:

What our favourite beach is like today, versus before. They’ll rebuild it, this isn’t unprecedented, but yeah, there’s some work to be done now. pic.twitter.com/6zFMG7bZqK
— Troy Hunt (@troyhunt) March 8, 2025

But hey, everyone is fine (not just us, the whole city AFAIK), so that's a good outcome. Back on topic, here's this week's video:

References:

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
We're filling in the gaps of the stealer logs that have come before, and doing our best to clean everything up a bit while we're there (but we're never going to have totally "clean" data: GIGO)
Someone tried to phish a PayPal OTP from me and instead faced some great trolling by Elle (so proud 🥲)
Someone also tried to phish my X credentials from me (that one really took some thinking to emphatically put it in the "phish" box)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
March 8^th 2025 at 08:00

Troy Hunt
We're Backfilling and Cleaning Stealer Logs in Have I Been Pwned
March 4^th 2025 at 04:45

We're Backfilling and Cleaning Stealer Logs in Have I Been Pwned

By: Troy Hunt

I think I've finally caught my breath after dealing with those 23 billion rows of stealer logs last week. That was a bit intense, as is usually the way after any large incident goes into HIBP. But the confusing nature of stealer logs coupled with an overtly long blog post explaining them and the conflation of which services needed a subscription versus which were easily accessible by anyone made for a very intense last 6 days. And there were the issues around source data integrity on top of everything else, but I'll come back to that.

When we launched the ability to search through stealer logs last month, that wasn't the first corpus of data from an info stealer we'd loaded, it was just the first time we'd made the website domains they expose searchable. Now that we have an actual model around this, we're going to start going back through those prior incidents and backfilling the new searchable attributes. We've just done that with the 26M unique email address corpus from August last year and added a bunch previously unseen instances of an email address mapped against a website domain. We've also now flagged that incident as "IsStealerLog", so if you're using the API, you'll see that attribute now set to true.

For the most part, that data is all handled just the same as the existing stealer log data: we map email addresses to the domains they've appeared against in the logs then make all that searchable by full email address, email address domain or website domain (read last week's really, really long blog post if you need an explainer on that). But there's one crucial difference that we're applying both to the backfilling and the existing data, and that's related to a bit of cleaning up.

A theme that emerged last week was that there were email addresses that only appeared against one domain, and that was the domain the address itself was on. If john@gmail.com is in there and the only domain he appears against is gmail.com, what's up with that? At face value, John's details have been snared whilst logging on to Gmail, but it doesn't make sense that someone infected with an info stealer only has one website they've logging into captured by the malware. It should be many. This seems to be due to a combination of the source data containing credential stuffing rows (just email and password pairs) amidst info stealer data and somewhere in our processing pipeline, introducing integrity issues due to the odd inputs. Garbage in, garbage out, as they say.

So, we've decided to apply some Occam's razor to the situation and go with the simplest explanation: a single entry for an email address on the domain of that email address is unlikely to indicate an info stealer infection, so we're removing those rows. And not adding any more that meet that criteria. But there's no doubt the email address itself existed in the source; there is no level of integrity issues or parsing errors that causes john@gmail.com to appear out of thin air, so we're not removing the email addresses in the breach, just their mapping to the domain in the stealer log. I'd already explained such a condition in Jan, where there might be an email address in the breach but no corresponding stealer log entry:

The gap is explained by a combination of email addresses that appeared against invalidly formed domains and in some cases, addresses that only appeared with a password and not a domain. Criminals aren't exactly renowned for dumping perfectly formed data sets we can seamlessly work with, and I hope folks that fall into that few percent gap understand this limitation.

FWIW, entries that matched this pattern accounted for 13.6% of all rows in the stealer log table, so this hasn't made a great deal of difference in terms of outright volume.

This takes away a great deal of confusion regarding the infection status of the address owner. As part of this revision, we've updated all the stealer log counts seen on domain search dashboards, so if you're using that feature, you may see the number drop based on the purged data or increase based on the backfilled data. And we're not sending out any additional notifications for backfilled data either; there's a threshold at which comms becomes more noise than signal and I've a strong suspicion that's how it would be received if we started sending emails saying "hey, that stealer log breach from ages ago now has more data".

And that's it. We'll keep backfilling data, and the entire corpus within HIBP is now cleaner and more succinct. And we'll definitely clean up all the UX and website copy as part of our impending rebrand to ensure everything is a lot clearer in the future.

I'll leave you with a bit of levity related to subscription costs and value. As I recently lamented, resellers can be a nightmare to deal with, and we're seriously considering banning them altogether. But occasionally, they inadvertently share more than they should, and we get an insight into how the outside world views the service. Like a recent case where a reseller accidentally sent us the invoice they'd intended to send the customer who wanted to purchase from us, complete with a 131% price markup 😲 It was an annual Pwned 4 subscription that's meant to be $1,370, and simply to buy this on that customer's behalf and then hand them over to us, the reseller was charging $3,165. They can do this because we make the service dirt cheap. How do we know it's dirt cheap? Because another reseller inadvertently sent us this internal communication today:

FWIW, we do have credit cards in Australia, and they work just the same as everywhere else. I still vehemently dislike resellers, but at least our customers are getting a good deal, especially when they buy direct 😊

🏷️ My labels
- ❌
Article tags
- ❌
- Have I Been Pwned
March 4^th 2025 at 04:45

Troy Hunt
Weekly Update 441
February 28^th 2025 at 04:37

Weekly Update 441

By: Troy Hunt

Processing data breaches (especially big ones), can be extremely laborious. And, of course, everyone commenting on them is an expert, so there's a heap of opinions out there. And so it was with the latest stealer logs, a corpus of data that took the better part of a month to process. And then I made things confusing in various ways which led to both Disqus comment and ticket hell. But hey, it's finally out and now it's back to normal breach processing for the foreseeable future 🙂

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
I trawled through 23 billion stealer logs to get a 284M breached email addresses into HIBP (and learned that explaining this concept clearly is hard!)
Apple is pulling support for their Advanced Data Protection E2E offering (but will the status quo change before they force existing users to disable it?)
Spyware / stalkerware apps Cocospu and Spyic leaker their data for all to see (and since that recording, Spyzie has also been added to the list)
The Zimi Senoa IoT switches are beautiful... (...but I think that Bluetooth mesh via a proprietary hub is going to be a show-stopper)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
February 28^th 2025 at 04:37

Troy Hunt
Processing 23 Billion Rows of ALIEN TXTBASE Stealer Logs
February 25^th 2025 at 19:23

Processing 23 Billion Rows of ALIEN TXTBASE Stealer Logs

By: Troy Hunt

I like to start long blog posts with a tl;dr, so here it is:

We've ingested a corpus of 1.5TB worth of stealer logs known as "ALIEN TXTBASE" into Have I Been Pwned. They contain 23 billion rows with 493 million unique website and email address pairs, affecting 284M unique email addresses. We've also added 244M passwords we've never seen before to Pwned Passwords and updated the counts against another 199M that were already in there. Finally, we now have a way for domain owners to query their entire domain for stealer logs and for website operators to identify customers who have had their email addresses snared when entering them into the site. (Note: stealer logs are still freely and easily searchable by individuals, scroll to the bottom for a walkthrough.)

This work has been a month-long saga that began hot off the heels of processing the last massive stash of stealer logs in the middle of Jan. That was the first time we'd ever added more context to stealer logs by way of making the websites email addresses had been logged against searchable. To save me repeating it all here, if you're unfamiliar with stealer logs as a concept and what we've previously done with HIBP, start there.

Up to speed? Good, let's talk about ALIEN TXTBASE.

Origin Story

Last month after loading the aforementioned corpus of data, someone in a government agency reached out and pointed me in the direction of more data by way of two files totalling just over 5GB. Their file names respectively contained the numbers "703" and "704", the word "Alien" and the following text at the beginning of each file:

Pulling the threads, it turned out the Telegram channel referred to contained 744 files of which my contact had come across just the two. The data I'm writing about today is that full corpus, published to Telegram as individual files:

A quick side note on Telegram: There's been growing concern in recent years about the use of Telegram by organised crime, especially since the founder's arrest in France last year for not cracking down on illegal activity on the platform. Telegram makes it super easy to publish large volumes of data (such as we're talking about here) under the veil of anonymity and distribute it en mass. This is just one of many channels involved in cybercrime, but it's noteworthy due to the huge amount of freely accessible data.

The file in the image above contained over 36 million rows of data consisting of website URLs and the email addresses and passwords entered into them. But the file is just a sample - a teaser - with more data available via the subscription options offered in the message. And that's the monetisation route: provide existing data for free, then offer a subscription to feed newly obtained logs to consuming criminals with a desire to exploit the victims again. Again? The stealer logs are obtained in the first place by exploiting the victim's machine, for example:

How do people end up in stealer logs? By doing dumb stuff like this: “Around October I downloaded a pirated version of Adobe AE and after that a trojan got into my pc” pic.twitter.com/igEzOayCu6
— Troy Hunt (@troyhunt) August 5, 2024

So now this guy has malware running on his PC which is siphoning up all his credentials as they're entered into websites. It's those credentials that are then sold in the stealer logs and later used to access the victim's accounts, which is the second exploitation. Pirating software is just one way victims become infected; have a read of this recent case study from our Australian Signals Directorate:

When working from home, Alice remotely accesses the corporate network of her organisation using her personal laptop. Alice downloaded, onto her personal laptop, a version of Notepad++ from a website she believed to be legitimate. An info stealer was disguised as the installer for the Notepad++ software.

When Alice attempted to install the software, the info stealer activated and began harvesting user credentials from her laptop. This included her work username and password, which she had saved in her web browser’s saved logins feature. The info stealer then sent those user credentials to a remote command-and-control server controlled by a cybercriminal group.

Eventually, data like Alice's ends up in places like this Telegram channel and from there, it enables further crimes. From the same ASD article:

Stolen valid user credentials are highly valuable to cybercriminals, because they expedite the initial access to corporate networks and enterprise systems.

So, that's where the data has come from. As I said earlier, ALIEN TXTBASE is by no means the only Telegram channel out there, but it is definitely a major distribution channel.

Verification

When there's a breach of a discrete website, verification of the incident is usually pretty trivial. For example, if Netflix suffered a breach (and I have no indication they have, this is just an example), I can go to their website, head to the password reset field, enter a made-up email address and see a response like this:

On the other hand, an address that does exist on the service usually returns a message to the effect of "we've sent you a password reset email". This is called an "enumeration vector" in that it enables you to enumerate through a list of email addresses and find out which ones have an account on the site.

But stealer logs don't come from a single source like Netflix, instead they contain the credentials for a whole range of different sites visited by people infected with malware. However, I can still take lines from the stealer logs that were captured against Netflix and test the email addresses. (Note: I never test if the password is valid, that would be a privacy violation that constitutes unauthorised access and besides, as you'll read next, there's simply no need to.)

Initially, I actually ran into a bit of a roadblock when testing this:

I found this over and over again so, I went back and checked the source data and inspected this poor victim's record:

Their Netflix credentials were snared when they were entered into the website with a path of "/ph-en/login", implying they're likely Filipino. Let's try VPN'ing into Manilla:

And suddenly, a password reset gives me exactly what I need:

That's a little tangent from stealer logs, but Netflix obviously applies some geo-fencing logic to certain features. This actually worked out better than expected verification-wise because not only was I able to confirm the presence of the email address on their service, but that the stealer log entry placing them in the Philippines was also geographically correct. It was reproducible too: when I saw "something went wrong", but the path began with "mx", I VPN'd into Mexico City and Netflix happily confirmed the reset email was sent. Another path had "ve", so it was off to Caracas and the Venezuelan victim's account was confirmed. You get the idea. So, strong signal on confirmation of account existence via password reset, now let's also try something more personal.

I emailed a handful of HIBP subscribers and asked for their support verifying a breach. I got a fast, willing response from one guy and sent over more than 1,100 rows of data against his address 😲 It's fascinating what you can tell about a person from stealer log data: He's obviously German based on the presence of websites with a .de address, and he uses all the usual stuff most of us do (Amazon, eBay, LinkedIn, Airbnb). But it's the less common ones that make for more interesting reading: He drives a Mercedes because he's been logging into an address there for owners, and it also appears he likes whisky given his account at an online specialist. He's a Firefox user, as he's logged in there too, and he seems to be a techie as he's logged into Seagate and a site selling some very specialised electrical testing equipment. But is it legit?

Imagine the heart-in-mouth moment the poor guy must have had seeing his digital life laid out in front of him like that and knowing criminals have this data. It'd be a hell of a shock.

Having said all that, whilst I'm confident there's a large volume of legitimate data in this corpus, it's also possible there will be junk. Fabricated email addresses, websites that were never used, etc. I hope folks who experience this can appreciate how hard it is for us to discern "legitimate" stealer logs from those that were made up. We've done as much parsing and validation as possible, but we have no way of knowing if someone@yourdomain.com is an email address that actually exists or if it does, if they ever actually used Netflix or Spotify or whatever. They're just strings of data, and all we can do is report them as found.

Searching Entries Against Your Website with a Pwned 5 Subscription

When we published the stealer logs last month, I kept caveating everything with "experimental". Even the first word of the blog post title was "experimenting", simply because we didn't know how this would be received. Would people welcome the additional data we were making available? Or find it unactionable noise? It turns out it was entirely the former, and I didn't see a single negative comment or, as it often has been in the past with stealer logs or malware breaches, angry victims demanding I send their entire row(s) of data. And I guess that makes sense given what we made available so, starting today, we're making even more available!

First, a bit of nomenclature. Consider the following stealer log entry:

https://www.netflix.com/en-ph/login:john@gmail.com:P@ssw0rd

There are four parts to this entry that are relevant to the HIBP services I'm going to write about here:

			Email Address
	Website Domain		Email Alias		Email Domain
https://	www.netflix.com	/en-ph/login	john	@	example.com	P@ssw0rd

Last month, we added functionality to the UI such that after verifying your email address you could see a collection of website domains. In the example above, this meant that John could use the free notification service to verify control of his email address after which he'd see www.netflix.com listed. (Note: we're presently totally redesigning this as part of our UX rebuild and it'll be much smoother in the very near future.) Likewise, we introduced an API to do exactly the same thing, but only after verifying control of the email domain. So, in the case above, the owner of example.com would be able to query john@example.com and get back www.netflix.com (along with any other website domains poor John had been using).

Today, we're introducing two new APIs, and they're big ones:

Query stealer logs by email domain
Query stealer logs by website domain

The first one is akin to our existing domain search feature so in the example above, the owner of the domain could query the stealer logs for example.com and get back each email address alias and the website domains they appear against. Here's what that output looks like:

{
  "john": [
    "netflix.com"
  ],
  "jane": [
    "netflix.com",
    "spotify.com"
  ]
}

The previous model only allowed querying by email address, so you could end up with an organisation needing to iterate through thousands of individual API requests. This model means that can now be done in a single request, which will make life much easier for larger organisations assessing the exposure of their workforce.

The second new API is designed for website operators who want to identify customers who've had their credentials snared at login. So, in our demo row above, Netflix could query www.netflix.com (after verifying control of the domain, of course) and retrieve a list of their customers from the stealer logs:

[
  "john@example.com",
  "jane@yahoo.com"
]

Both these new APIs are orientated towards larger organisations and can return vast volumes of data. When considering how to price this service, the simplest, most commensurate model we arrived at was to use a pricing tier we already had: Pwned 5:

Whilst we'd previously only ever listed tiers 1 through 4, we'd always had higher tiers sitting there in the background for organisations needing higher rate limits. Surfacing this subscription and adding the ability to query stealer logs via these two new APIs makes it easy for new and existing subscribers alike to unlock the feature. And if you are an existing subscriber, the price is simply adjusted pro rata at Stripe's end such that your existing balance is carried forward. Per the above image, this subscription is available either monthly or annually so if you just want to see what's in the current corpus of data and keep the cost down, take it for a month then cancel it. (Note: the Pwned 5 subscription is also now required for the API to search by email address we launched last month, but the web UI that uses the notification service to show stealer log results by email is absolutely still free and will remain that way.)

Another small addition since last month is that we've added an "IsStealerLog" flag on the breach model. This means that anyone programmatically dealing with data in HIBP can now easily elect to handle stealer logs differently than other breaches. For example, a new breach with this flag set to "true" might then trigger a query of the new API endpoints to search by domain so that an organisation can update their records with any new stealer log entries.

Anyone searching by email domain already knows the scope of addresses on their domain as it's reported on their dashboard. Plus, when email notifications are sent on breach load it tells you exactly how many new addresses from your domains are in the breach. Starting today, we've also added a column to explain how many email addresses appear against your website domain:

In other words, 3 people have had their email address grabbed by an info stealer when logging on to hibp-integration-tests.com, and the new API will return all of those addresses. It is only API-based for the moment, we'll consider if a UI makes sense as part of the rebranded site launch, it may not becuase of the potentially huge volumes of data.

Just one last thing: for the two new APIs that query by domain, we've set a rate limit which is entirely independent of the rate limit on, say, breached account searches. Whilst a Pwned 5 subscription would allow 1,000 requests to that API every minute, it's significantly more restricted when hitting those two new stealer log APIs. We haven't published a number as I expect we'll tweak it a bit based on usage, but it's more than enough for any normal use of the service whilst ensuring we don't get completely overwhelmed by high-overhead searches. The stealer log API that queries by email address inherits the 1,000 RPM rate limit of the Pwned 5 subscription.

We've Added 244M New Passwords to Pwned Passwords

One of the coolest most awesome best things we've ever done with HIBP is to create a massive repository of passwords that's all open source (both code and data) and can be queried anonymously without disclosing the searched password. Pwned Passwords is amazing, and it has gained huge traction:

There it is - we’ve now passed 10,000,000,000 requests to Pwned Password in 30 days 😮 This is made possible with @Cloudflare’s support, massively edge caching the data to make it super fast and highly available for everyone. pic.twitter.com/kw3C9gsHmB
— Troy Hunt (@troyhunt) October 5, 2024

10 billion times a month, our API helps a service somewhere assist one of their customers with making a good password choice. And that's just the API - we have no idea the full scope of the service as having the data open source means people can just download the entire corpus and run it themselves.

Per the opening para, we now have an additional 244 million previously unseen passwords in this corpus. And, as always, they make for some fun reading:

tender-kangaroo
swimmingkangaroo59
gentlekangaroo
CaptainKangaroo340

And, uh, some kangaroos doing other stuff I can't really repeat here. Those passwords are at the final stages of loading and should flush through cache to Cloudflare's hundreds of edge nodes in the next few hours. That's another quarter of a billion that join the list of previously breached ones, whilst 199 million we'd already seen before have had their prevalence counts upped.

HIBP in Practice

It's amazing to see where my little pet project with the stupid name has gone, and nobody is more surprised than me when it pops up in cool places. Looking around for some stealer log references while writing this blog post, I came across this one:

This was already in place when you created a new account or updated your password. But now it's also verified on every login against the live HIBP database. Hats off to the tremendous service HIBP provides to the internet 🙏 https://t.co/Z61AgDaL2t
— DHH (@dhh) February 4, 2025

That's awesome! That's exactly the sort of use case that speaks to the motto of "do good things with breach data after bad things happen". By adding this latest trove of data, the folks using Basecamp will immediately benefit simply by virtue of the service being plugged into our API. And sidenote: David has done some amazing stuff in the past so I was especially excited to see this shout-out 😊

This one is a similar story, albeit using Pwned Passwords:

Their service is phenomenal! We also inform users in our product if they set/change their password to a known password that has been hacked. Admins have the option to not allow for users to use these passwords, if they wish. pic.twitter.com/bvLfYm9xzH
— Brad Marshall (@iamBMarshall) February 4, 2025

Inevitably, those requests form a slice of the 10 billion monthly we see that are now able to identify a quarter of a billion more compromised ones and hopefully, keep them out of harm's way.

For many organisations, the data we're making available via stealer logs is the missing piece of the puzzle that explains patterns that were previously unexplainable:

Gotta say I’m pretty happy with what we did with stealer logs last week, think we’re gonna need to do more of this 😎 pic.twitter.com/4rMaMmL8LU
— Troy Hunt (@troyhunt) January 21, 2025

I've had so many emails to that effect after loading various stealer logs over the years. The constant theme I love hearing is how it benefits the good guys and, well, not so much the bad guys:

I love it when @haveibeenpwned screws over the bad guys 😎 pic.twitter.com/I29aMhwClW
— Troy Hunt (@troyhunt) January 16, 2025

The introduction of these new APIs today will finally help many organisations identify the source of malicious activity and even more importantly, get ahead of it and block it before it does damange. Whilst there won't be any set cadence to the addition of more stealer logs (obviously, we can't really predict when this stuff will emerge), I have no doubt we'll continue to add a lot more data yet.

Techie Bits

Processing this data has been non-trivial to say the least, so I thought I'd give you a bit of an overview of how I ultimately approached it. In essence, we're talking about transforming a very large amount of highly redundant text-based data and ultimately ending up with a distinct list of email addresses against website domains. Here's the high-level process:

Start with 744 files of logs totalling 1.5TB and containing 23 billion rows
Extract all the unique email addresses from the entire corpus of stealer logs using the open source Email Address Extractor tool (284M rows)
In a .NET console app, process each file in point 1 and extract the domain and email address from valid lines (domain, email and password, all colon delimited) to produce 744 files totalling 390GB (9.7B rows)
In another .NET console app, consolidate all 744 files from the previous point into a single file with a distinct set of website domain and email address pairs (789M rows)
Take the file from the previous point, and in another .NET console app, extract all the unique domains (18M rows)
Use SQL BCP to upload the files from the two previous points to SQL Azure
Insert any new domains that don't already exist in HIBP (these are held in a dedicated table) via a TSQL statement (6.7M rows)
Upload the 284M unique email addresses like with a typical data breach (69% of them were already in HIBP)
Join the distinct list of domains and email addresses to the data uploaded in the previous point and insert email and domain pairs into the stealer log table, but only if they haven't been seen before via another TSQL statement (493M rows)
Wait - if I took distinct website and email pairs in step 4 and got 789M rows then in step 9 it only inserted 493M, what happened? There were 220M rows already in HIBP from last month which will account for some of the gap (existing records aren't reinserted), and there was also some additional validation in SQL Server courtesy of code we only have in that environment. The remaining gap is explained by the .NET code not ignoring case on distinct, so in other words, I dumped and uploaded way more data than I had to and made SQL Sever do extra work 🤦‍♂️
Go live and get 🍺

It was actually much, much harder than this due to the trials and errors of attempting to distil what starts out as tens of billions of rows down into hundreds of millions of unique examples. I was initially doing a lot more of the processing in SQL Azure because hey, it's just cloud and you can turn up the performance as much as you want, right?! After running 80 cores of Azure SQL Hyperscale for days on end ($ouch$) and seeing no end in sight to the execution of queries intended to take distinct values across billions of rows, I fell back to local processing with .NET. You could, of course, use all sorts of other technologies of choice here, but it turned out that local processing with some hand-rolled code was way more efficient than delegating the task to a cloud-based RDBMS. The space used by the database tells a big part of the story:

As I've said many times before, the cloud, my friends, is not always the answer. Do as much processing as possible locally on sunk-cost hardware unless there's a compelling reason not to.

I've detailed all this here in part because that's what I've always done with this project over the last 11 and a bit years, but also to illustrate how much time, effort, and money is burned processing data like this. It's very non-trivial, especially not when everything has to ultimately go into an increasingly large system with loads of external dependencies and be fast, reliable and cost-effective.

Conclusion

From 23 billion raw rows down to a much more manageable and discrete set of data, the latest stealer logs are now searchable via all the ways you've done for years, plus those two new domain-based stealer log APIs. We've called this breach "ALIEN TXTBASE Stealer Logs", let's see what positive outcomes we can all now produce with the data.

Edit 1: Let me re-emphasise an important point from the blog post I think got a bit buried: The web UI that uses the notification service to show stealer log results by email is absolutely still free and will remain that way. If you’ve got an email address in this breach and you want to see the stealer log domains against it, do this:

Go to the notification page: https://haveibeenpwned.com/NotifyMe
Fill in your email address and send yourself the verification email
Click the link emailed to you and scroll to the bottom of the page where you'll find a message similar to this:

We need to make this clearer as it's obviously confusing. Thanks for everyone's feedback, we're working on it.

Edit 2: I've just published a (much shorter!) blog post that addresses a common theme in the comments regarding email addresses that only appear against a single website domain, being the same domain as the email address itself is on. Check that out if that's you.

🏷️ My labels
- ❌
Article tags
- ❌
- Have I Been Pwned
February 25^th 2025 at 19:23

Troy Hunt
Weekly Update 440
February 25^th 2025 at 08:17

Weekly Update 440

By: Troy Hunt

Wait - it's Tuesday already?! When you listen to this week's (ok, last week's) video, you'll probably get the sense I was a bit overloaded. Yeah, so that didn't stop, and the stealer log processing and new feature building just absolutely swamped me. Plus, I spent from then until now in Sydney at various meetings and events which was great, but didn't do a lot for my productivity. Be that as it may, we're now less than 12 hours off launching this all so, in the interests of not having me stay up all night putting the finishing touches on it, let me drop here and come back in a few days to talk about how it's all been received 🤞

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
February 25^th 2025 at 08:17

Troy Hunt
Weekly Update 439
February 16^th 2025 at 06:10

Weekly Update 439

By: Troy Hunt

We're now eyeball-deep into the HIBP rebrand and UX work, totally overhauling the image of the service as we know it. That said, a guiding principle has been to ensure the new looks is immediately recognisable and over months of work, I think we've achieved that. I'm holding off sharing anything until we're far enough down the road that we're confident in the direction we're heading, and then I want to invite the masses to contribute as we head towards a (re)launch.

Whilst I didn't talk about it in this week's video, let me just recap on why we're doing this: the decisions made for a pet project nearly 12 years ago now are very different to the decisions made for a mainstream service with so many dependencies on it today. We're at a point where we need more professionalism and cohesion and that's across everything from the website design and content, the branding on our formal documentation, the stickers I hand out all over the place, the swag we want to make and even the signatures on our emails. Our task is to keep the heart and soul of a humble community-first project whilst simultaneously making sure it actually looks like we know what we're doing 🙂

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
Authorised access by DOGE employees is not a data breach (no, not even if you really, really, really don't like Donald and Elon)
The HIBP rebrand is now a long way through, and we'd love to hear your ideas (it's not just the look and feel, I want to get a lot more functionality in there)
The latest Zacks breach went into HIBP (that's right, this isn't their first rodeo)
Apparently, our discussion about possibly banning resellers is newsworthy (and this isn't a done deal yet, we are also looking at the feasibility of automating away the pain)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
February 16^th 2025 at 06:10

Troy Hunt
Weekly Update 438
February 9^th 2025 at 07:04

Weekly Update 438

By: Troy Hunt

I think what's really scratching an itch for me with the home theatre thing is that it's this whole geeky world of stuff that I always knew was out there, but I'd just never really understood. For example, I mentioned waveforming in the video, and I'd never even heard of that let alone understood that there may be science where sound waves are smashed into each other in opposing directions in order to cancel each other out. And I'm sure I've got that completely wrong, but that's what's so fun about this! Anyway, that's all just part of the next adventure, and I hope you enjoy hearing about it and sending over your thoughts because I'm pretty sure there's a gazillion things I don't know yet 🙂

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
We're going down the home theatre rabbit hole! (check out some of the work these guys have done, just amazing)
We're seriously considering booting resellers off HIBP altogether (0.86% of our customers who come through them are consuming the same amount of support time as the entire remaining 99.14% 😲)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
February 9^th 2025 at 07:04

Troy Hunt
Weekly Update 437
February 2^nd 2025 at 05:15

Weekly Update 437

By: Troy Hunt

It's IoT time! We're embarking on a very major home project (more detail of which is in the video), and some pretty big decisions need to be made about a very simple device: the light switch. I love having just about every light in our connected... when it works. The house has just the right light early each morning, it transitions into daytime mode right at the perfect time based on the amount of solar radiation in the sky, into evening time courtesy of the same device and then blacks out when we go to bed. And some lights come on with movement based on motion sensors in fans (Big Ass fans have occupancy sensors), cameras (Ubiquiti camera raise motion events), and tiny dedicated Zigbee sensors. But getting the right physical switches in combination with the right IoT relays has been a bit more challenging. Listen to this week's show let me know if you have any "bright" ideas 🙂

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
Light switches, IoT relays and other complex discussions about simple circuits (it's such a critical component of the house, especially when you replicate the model >100 times over)
Apparently, the YubiKey phish wasn't a phish (seriously folks, if I can't tell when comms is legit or not, how are the normies expected to get it right?!)
The ABC's analysis of 4-digit PINs in HIBP is really well done! (although I did spend way too much time explaining to other journalists how there are only 10,000 possible values 🤔)
The HIBP Grafana dashboard is looking epic! (although I may be blowing way more time on it than anyone could reasonably justify...)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
February 2^nd 2025 at 05:15

Troy Hunt
Weekly Update 436
January 25^th 2025 at 04:42

Weekly Update 436

By: Troy Hunt

We're heading back to London! And making a trip to Reykjavik. And Dublin. I talked about us considering this in the video yesterday, and just before publishing this post, we pulled the trigger and booked the tickets. The plan is to pretty much repeat the US and Canada trip we did in September and spend the time meeting up with some of the law enforcement agencies and various other organisations we've been working with over the years. As I say in the video, if you're in one of these locations and are in a position to stand up a meetup or user group session, I'd love to hear from you. Europe is a hell of a long way to go so we do want to make the most of the travel, stand by for more plans as they emerge.

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
The HIBP "Wall of Graphs" looks awesome! (I'll blog it up, but there's more to be done first)
Spamming ~500 companies attempting to look for bug bounties is muppet behaviour (all whilst putting them on CC too 🤦‍♂️)
Despite a couple of dissenting voices re the muppet characterisation, 84.5% of people agreed with my description (or in other words, 15.5% of people were completely wrong)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
January 25^th 2025 at 04:42

Troy Hunt
You Can't Trust Hackers, and Other Data Breach Verification Tales
January 23^rd 2025 at 03:14

You Can't Trust Hackers, and Other Data Breach Verification Tales

By: Troy Hunt

It's hard to find a good criminal these days. I mean a really trustworthy one you can be confident won't lead you up the garden path with false promises of data breaches. Like this guy yesterday:

For my international friends, JB Hi-Fi is a massive electronics retailer down under and they have my data! I mean by design because I've bought a bunch of stuff from them, so I was curious not just about my own data but because a breach of 12 million plus people would be massive in a country of not much more than double that. So, I dropped the guy a message and asked if he'd be willing to help me verify the incident by sharing my own record. I didn't want to post any public commentary about this incident until I had a reasonable degree of confidence it was legit, not given how much impact it could have in my very own backyard.

Now, I wouldn't normally share a private conversation with another party, but when someone sets out to scam people, that rule goes out the window as far as I'm concerned. So here's where the conversation got interesting:

He guaranteed it for me! Sounds legit. But hey, everyone gets the benefit of the doubt until proven otherwise, so I started looking at the data. It turns out my own info wasn't in the full set, but he was happy to provide a few thousand sample records with 14 columns:

customer_id_
first_name
last_name
FullName
gender
email_address_
mobile_country_
mobile_number_
dob
postal_street_1_
state_
postal_code_
city_
account_status

Pretty standard stuff, could be legit, let's check. I have a little Powershell script I run against the HIBP API when a new alleged breach comes in and I want to get a really good sense of how unique it is. It simply loops through all the email addresses in a file, checks which breaches they've been in and keeps track of the percentage that have been seen before. A unique breach will have anywhere from about 40% to 80% previously seen addresses, but this one had, well, more:

Spot the trend? Every single address has one breach in common. Hmmm... wonder what the guy has to say about that?

But he was in the server! And he grabbed it from the dashboard of Shopify! Must be legit, unless... what if I compared it to the actual full breach of Dymocks? That's a local Aussie bookseller (so it would have a lot of Aussie-looking email addresses in it, just like JB Hi-Fi would), and their breach dated back to mid-2023. I keep breaches like that on hand for just such occasions, let's compare the two:

Wow! What are the chances?! He's going to be so interested when he hears about this!

And that was it. The chat went silent and very shortly after, the listing was gone:

It looks like the bloke has also since been booted off the forum where he tried to run the scam so yeah, this one didn't work out great for him. That $16k would have been so tasty too!

I wrote this short post to highlight how important verification of data breach claims is. Obviously, I've seen loads of legitimate ones but I've also seen a lot of rubbish. Not usually this blatant where the party contacting me is making such demonstrably false claims about their own exploits, but very regularly from people who obtain something from another party and repeat the lie they've been told. This example also highlights how useful data from previous breaches is, even after the email addresses have been extracted and loaded into HIBP. Data is so often recycled and shipped around as something new, this was just a textbook perfect case of making use of a previous incident to disprove a new claim. Plus, it's kinda fun poking holes in a scamming criminal's claims 😊

🏷️ My labels
- ❌
Article tags
- ❌
- Security
January 23^rd 2025 at 03:14

Troy Hunt
Weekly Update 435
January 21^st 2025 at 02:14

Weekly Update 435

By: Troy Hunt

If I'm honest, I was in two minds about adding additional stealer logs to HIBP. Even with the new feature to include the domains an email address appears against in the logs, my concern was that I'd get a barrage of "that's useless information" messages like I normally do when I load stealer logs! Instead, the feedback was resoundingly positive. This week I'm talking more about the logic behind this, some of the challenges we faced with it and what we might see in the future. Stay tuned, because I think we're going to be seeing a lot more of this in HIBP.

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
For the first time ever, we added a heap of additional info about stealer logs to HIBP (ok, it's just the domains an address appears against, but that turns out to have been really useful)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
January 21^st 2025 at 02:14

Troy Hunt
Experimenting with Stealer Logs in Have I Been Pwned
January 13^th 2025 at 19:48

Experimenting with Stealer Logs in Have I Been Pwned

By: Troy Hunt

TL;DR — Email addresses in stealer logs can now be queried in HIBP to discover which websites they've had credentials exposed against. Individuals can see this by verifying their address using the notification service and organisations monitoring domains can pull a list back via a new API.

Nasty stuff, stealer logs. I've written about them and loaded them into Have I Been Pwned (HIBP) before but just as a recap, we're talking about the logs created by malware running on infected machines. You know that game cheat you downloaded? Or that crack for the pirated software product? Or the video of your colleague doing something that sounded crazy but you thought you'd better download and run that executable program showing it just to be sure? That's just a few different ways you end up with malware on your machine that then watches what you're doing and logs it, just like this:

These logs all came from the same person and each time the poor bloke visited a website and logged in, the malware snared the URL, his email address and his password. It's akin to a criminal looking over his shoulder and writing down the credentials for every service he's using, except rather than it being one shoulder-surfing bad guy, it's somewhat larger than that. We're talking about billions of records of stealer logs floating around, often published via Telegram where they're easily accessible to the masses. Check out Bitsight's piece titled Exfiltration over Telegram Bots: Skidding Infostealer Logs if you'd like to get into the weeds of how and why this happens. Or, for a really quick snapshot, here's an example that popped up on Telegram as I was writing this post:

As it relates to HIBP, stealer logs have always presented a bit of a paradox: they contain huge troves of personal information that by any reasonable measure constitute a data breach that victims would like to know about, but then what can they actually do about it? What are the websites listed against their email address? And what password was used? Reading the comments from the blog post in the first para, you can sense the frustration; people want more info and merely saying "your email address appeared in stealer logs" has left many feeling more frustrated than informed. I've been giving that a lot of thought over recent months and today, we're going to take a big step towards addressing that concern:

The domains an email address appears next to in stealer logs can now be returned to authorised users.

This means the guy with the Gmail address from the screen grab above can now see that his address has appeared against Amazon, Facebook and H&R Block. Further, his password is also searchable in Pwned Passwords so every piece of info we have from the stealer log is now accessible to him. Let me explain the mechanics of this:

Firstly, the volumes of data we're talking about are immense. In the case of the most recent corpus of data I was sent, there are hundreds of text files with well over 100GB of data and billions of rows. Filtering it all down, we ended up with 220 million unique rows of email address and domain pairs covering 69 million of the total 71 million email addresses in the data. The gap is explained by a combination of email addresses that appeared against invalidly formed domains and in some cases, addresses that only appeared with a password and not a domain. Criminals aren't exactly renowned for dumping perfectly formed data sets we can seamlessly work with, and I hope folks that fall into that few percent gap understand this limitation.

So, we now have 220 million records of email addresses against domains, how do we surface that information? Keeping in mind that "experimental" caveat in the title, the first decision we made is that it should only be accessible to the following parties:

The person who owns the email address
The company that owns the domain the email address is on

At face value it might look like that first point deviates from the current model of just entering an email address on the front page of the site and getting back a result (and there are very good reasons why the service works this way). There are some important differences though, the first of which is that whilst your classic email address search on HIBP returns verified breaches of specific services, stealer logs contain a list of services that have never have been breached. It means we're talking about much larger numbers that build up far richer profiles; instead of a few breached services someone used, we're talking about potentially hundreds of them. Secondly, many of the services that appear next to email addresses in the stealer logs are precisely the sort of thing we flag as sensitive and hide from public view. There's a heap of Pornhub. There are health-related services. Religious one. Political websites. There are a lot of services there that merely by association constitute sensitive information, and we just don't want to take the risk of showing that info to the masses.

The second point means that companies doing domain searches (for which they already need to prove control of the domain), can pull back the list of the websites people in their organisation have email addresses next to. When the company controls the domain, they also control the email addresses on that domain and by extension, have the technical ability to view messages sent to their mailbox. Whether they have policies prohibiting this is a different story but remember, your work email address is your work's email address! They can already see the services sending emails to their people, and in the case of stealer logs, this is likely to be enormously useful information as it relates to protecting the organisation. I ran a few big names through the data, and even I was shocked at the prevalence of corporate email addresses against services you wouldn't expect to be used in the workplace (then again, using the corp email address in places you definitely shouldn't be isn't exactly anything new). That in itself is an issue, then there's the question of whether these logs came from an infected corporate machine or from someone entering their work email address into their personal device.

I started thinking more about what you can learn about an organisation's exposure in these logs, so I grabbed a well-known brand in the Fortune 500. Here are some of the highlights:

2,850 unique corporate email addresses in the stealer logs
3,159 instances of an address against a service they use, accompanied by a password (some email addresses appeared multiple times)
The top domains included paypal.com, netflix.com, amazon.com and facebook.com (likely within the scope of acceptable corporate use)
The top domains also included steamcommunity.com, roblox.com and battle.net (all gaming websites likely not within scope of acceptable use)
Dozens of domains containing the words "porn", "adult" or "xxx" (definitely not within scope!)
Dozens more domains containing the corporate brand, either as subdomains of their primary domain or org-specific subdomains of other services including Udemy (online learning), Amplify ("strategy execution platform"), Microsoft Azure (the same cloud platform that HIBP runs on) and Salesforce (needs no introduction)

That said, let me emphasise a critical point:

This data is prepared and sold by criminals who provide zero guarantees as to its accuracy. The only guarantee is that the presence of an email address next to a domain is precisely what's in the stealer log; the owner of the address may never have actually visited the indicated website.

Stealer logs are not like typical data breaches where it's a discrete incident leading to the dumping of customers of a specific service. I know that the presence of my personal email address in the LinkedIn and Dropbox data breaches, for example, is a near-ironclad indication that those services exposed my data. Stealer logs don't provide that guarantee, so please understand this when reviewing the data.

The way we've decided to implement these two use cases differs:

Individuals who can verify they control their email address can use the free notification service. This is already how people can view sensitive data breaches against their address.
Organisations monitoring domains can call a new API by email address. They'll need to have verified control of the domain the address is on and have an appropriately sized subscription (essentially what's already required to search the domain).

We'll make the individual searches cleaner in the near future as part of the rebrand I've recently been talking about. For now, here's what it looks like:

Because of the recirculation of many stealer logs, we're not tracking which domains appeared against which breaches in HIBP. Depending on how this experiment with stealer logs goes, we'll likely add more in the future (and fill in the domain data for existing stealer logs in HIBP), but additional domains will only appear in the screen above if they haven't already been seen.

We've done the searches by domain owners via API as we're talking about potentially huge volumes of data that really don't scale well to the browser experience. Imagine a company with tens or hundreds of thousands of breached addresses and then a whole heap of those addresses have a bunch of stealer log entries against them. Further, by putting this behind a per-email address API rather than automatically showing it on domain search means it's easy for an org to not see these results, which I suspect some will elect to do for privacy reasons. The API approach was easiest while we explore this service then we can build on that based on feedback. I mentioned this was experimental, right? For now, it looks like this:

Lastly, there's another opportunity altogether that loading stealer logs in this fashion opens up, and the penny dropped when I loaded that last one mentioned earlier. I was contacted by a couple of different organisations that explained how around the time the data I'd loaded was circulating, they were seeing an uptick in account takeovers "and the attackers were getting the password right first go every time!" Using HIBP to try and understand where impacted customers might have been exposed, they posited that it was possible the same stealer logs I had were being used by criminals to extract every account that had logged onto their service. So, we started delving into the data and sure enough, all the other email addresses against their domain aligned with customers who were suffering from account takeover. We now have that data in HIBP, and it would be technically feasible to provide this to domain owners so that they can get an early heads up on which of their customers they probably have to rotate credentials for. I love the idea as it's a great preventative measure, perhaps that will be our next experiment.

Onto the passwords and as mentioned earlier, these have all been extracted and added to the existing Pwned Passwords service. This service remains totally free and open source (both code and data), has a really cool anonymity model allowing you to hit the API without disclosing the password being searched for, and has become absolutely MASSIVE!

I thought that doing more than 10 billion requests a month was cool, but look at that data transfer - more than a quarter of a petabyte just last month! And it's in use at some pretty big name sites as well:

That's just where the API is implemented client-side, and we can identify the source of the requests via the referrer header. Most implementations are done server-side, and by design, we have absolutely no idea who those folks are. Shoutout to Cloudflare while we're here for continuing to provide the service behind this for free to help make a more secure web.

In terms of the passwords in this latest stealer log corpus, we found 167 million unique ones of which only 61 million were already in HIBP. That's a massive number, so we did some checks, and whilst there's always a bit of junk in these data sets (remember - criminals and formatting!) there's also a heap of new stuff. For example:

Tryingtogetkangaroo
Kangaroolover69
fuckkangaroos

And about 106M other non-kangaroo themed passwords. Admittedly, we did start to get a bit preoccupied looking at some of the creative ways people were creating previously unseen passwords:

passwordtoavoidpwned13
verygoodpassword
AVerryGoodPasswordThatNooneCanGuess2.0

And here's something especially ironic: check out these stealer log entries:

People have been checking these passwords on HIBP's service whilst infected with malware that logged the search! None of those passwords were in HIBP... but they all are now 🙂

Want to see something equally ironic? People using my Hack Yourself First website to learn about secure coding practices have also been infected with malware and ended up in stealer logs:

So, that's the experiment we're trying with stealer logs, and that's how to see the websites exposed against an email address. Just one final comment as it comes up every single time we load data like this:

We cannot manually provide data on a per-individual basis.

Hopefully, there's less need to now given the new feature outlined above, and I hope the massive burden of looking up individual records when there are 71 million people impacted is evident. Do leave your comments below and help us improve this feature to become as useful as we can possibly make it.

🏷️ My labels
- ❌
Article tags
- ❌
- Have I Been Pwned
January 13^th 2025 at 19:48

Troy Hunt
Weekly Update 434
January 12^th 2025 at 22:59

Weekly Update 434

By: Troy Hunt

This week I'm giving a little teaser as to what's coming with stealer logs in HIBP and in about 24 hours from the time of writing, you'll be able to see the whole thing in action. This has been a huge amount of work trawling through vast volumes of data and trying to make it usable by the masses, but I think what we're launchung tomorrow will be awesome. Along with a new feature around these stealer logs, we've also added a huge number of new passwords to Pwned Passwords not previously seen before. Now, for the first time ever, "fuckkangaroos" will be flagged by any websites using the service 😮 More awesome examples coming in tomorrow's blog post, stay tuned!

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
Publicly asking for a security contact ios really not something I want to be doing (it tends to be a last resort after not being able to raise the company via various other channels)
Massive kudos to Synology for making the DiskStation rollover process entirely seamless (little bit of work restoring Plex, but at least there was zero data loss)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
January 12^th 2025 at 22:59

Troy Hunt
Weekly Update 433
January 6^th 2025 at 05:20

Weekly Update 433

By: Troy Hunt

It sounds easy - "just verify people's age before they access the service" - but whether we're talking about porn in the US or Australia's incoming social media laws, the reality is way more complex than that. There's no unified approach across jurisdictions and even within a single country like Australia, the closest we've got to that is a government scheme usually intended for accessing public services. And even if there was a technically workable model, who wants to get either the gov or some big tech firm involved in their use of Instagram or Pornhub?! There's a social acceptance to be considered and not only that, circumvention of age controls is very easy when you can simply VPN into another jurisdiction and access the same website blocked in your locale. Or in the case of the adult material, I'm told (🤷‍♂️) there are many other legally operating websites in other parts of the world that are less inclined to block individuals in specific states from foreign countries. There'll be no easy solutions for this one, but it'll make for an entertaining year 😊

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
My trusty Synology DS1512+ finally died after 12 years of faithful service (since recording this video, the new DS923+ arrived and migration was super smooth)
Pornhub addressed the age verification mandate from a bunch of US states by simply... blocking them (I wonder if there's a way around that...)
Proton VPN has seen a "massive surge" in VPN signups from the US (...there we go 🙂)
The EFF reckons there is no effective age verification method (they also downplay the negative impacts of social media on kids, which I disagree with)
The Glamira data breach made it into HIBP (link through to a Reddit thread where the company acknowledged the breach last year, no word on whether they disclosed to impacted individuals)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
January 6^th 2025 at 05:20

Troy Hunt
Weekly Update 432
December 30^th 2024 at 21:52

Weekly Update 432

By: Troy Hunt

There's a certain irony to the Bluesky situation where people are pushing back when I include links to X. Now, where have we seen this sort of behaviour before? 🤔 When I'm relying on content that only appears on that platform to add context to a data breach in HIBP and that content is freely accessible from within the native Bluesky app (without needing an X account), we're out of reasonable excuses for the negativity. And if "because Elon" is the sole reason and someone is firm enough in their convictions on that, there's a very easy solution 🙂

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
We're rebuilding the front-end of Have I Been Pwned (there's a lot of opinions on that thread!)
People on Bluesky are complaining about posting links to content that only exist on X (not exactly the right way to encourage use of other platforms)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
December 30^th 2024 at 21:52

Troy Hunt
Weekly Update 431
December 25^th 2024 at 09:02

Weekly Update 431

By: Troy Hunt

I fell waaay behind the normal video cadence this week, and I couldn't care less 😊 I mean c'mon, would you rather be working or sitting here looking at this view after snowboarding through Christmas?!

Christmas Day awesomeness in Norway 🇳🇴 Have a great one friends, wherever you are 🧑‍🎄 pic.twitter.com/F2FtcJYzRC
— Troy Hunt (@troyhunt) December 25, 2024

That said, Scott and I did carve out some time to chat about the, uh, "colourful" feedback he's had after finally putting a price on some Report URI features he'd been giving away free for years. And there's more data breaches, of course, including a couple I loaded over the previous week that I think were particularly interesting. Enjoy this week's video, next week's will be a 2024 wrap-up from somewhere much, much sunnier 😎

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
After many years, Scott put a price on the free tier of Report URI (and some of the feedback he got 😲)
I couldn't raise Young Living Essential Oils about their data breach (and their data is spread all over a popular clear web hacking forum too)
The "French Citizens" data breach had Millions of French people in it... (...and a lot of other people too)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
December 25^th 2024 at 09:02

Troy Hunt
Weekly Update 430
December 15^th 2024 at 11:38

Weekly Update 430

By: Troy Hunt

I'm back in Oslo! Writing this the day after recording, it feels like I couldn't be further from Dubai; the temperature starts with a minus, it's snowing and there's not a supercar in sight.

Back on business, this week I'm talking about the challenge of loading breaches and managing costs. A breach load immediately takes us from a very high percentage cache hit ratio on Cloudflare to zero. Consequently, our SQL costs skyrocket as the DB scales to support the load. Approximately 28 hours after loading the two breaches I mention in this week's update, we're still running a DB scale that's 350% larger than once we have a high cache hit ratio, and that directly hits my wallet. We need to work on this more because as I say in the video, I really don't like financial incentives that influence how breaches are handled, such as delaying them and bulking them together to reduce the impact of cache flush events like this. We'll give that more thought, I think there are a few ways to tackle this. For now, here's this week's video and some of the challenges we're facing:

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
Some people really don't like supercars (although I suspect it's more about not liking to see either the enjoyment others take in them or the success they may have achieved)
Being online means having constant attacks against your online things (but failed login attempts against my son's and my Microsoft accounts are just that - failed attempts)
The German electricity provider Tibber had 50k records breached (a little one, but newsworthy enough to have hit the media)
And the first-ever Senegalese data breach went into HIBP courtesy of Yonéma (not exactly a high cross-over with our usual subscribers, but a breach is still a breach)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
December 15^th 2024 at 11:38

Troy Hunt
Weekly Update 429
December 8^th 2024 at 04:09

Weekly Update 429

By: Troy Hunt

A super quick intro today as I rush off to do the next very Dubai thing: drive a Lambo through the desert to go dirt bike riding before jumping in a Can-Am off-roader and then heading to the kart track for a couple of afternoon sessions. I post lots of pics to my Facebook account, and if none of that is interesting, here's this week's video on more infosec-related topics:

References

Sponsored by: Cyberattacks are guaranteed. Is your recovery? Protect your data in the cloud. Join Rubrik’s Cloud Resilience Summit.
The Armenian Government is now the 37th to have free and open access to their domains on HIBP (this gives them API-level domain searches to their gov TLD)
After two and a bit years on sale, we're now giving away "Pwned" the book, for free (go grab it in PDF or EPUB format)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
December 8^th 2024 at 04:09

Troy Hunt
"Pwned", The Book, Is Now Available for Free
December 6^th 2024 at 13:40

"Pwned", The Book, Is Now Available for Free

By: Troy Hunt

Nearly four years ago now, I set out to write a book with Charlotte and RobIt was the stories behind the stories, the things that drove me to write my most important blog posts, and then the things that happened afterwards. It's almost like a collection of meta posts, each one adding behind-the-scenes commentary that most people reading my material didn't know about at the time.

It was a strange time for all of us back then. I didn't leave the country for the first time in over a decade. I barely even left the state. I had time to toil on the passion project that became this book. As I wrote about years later, there were also other things occupying my mind at the time. Writing this book was cathartic, providing me the opportunity to express some of the emotions I was feeling at the time and to reflect on life.

Speaking of reflecting, this week was Have I Been Pwned's 11th birthday. Reaching this milestone, getting back to travel (I'm writing this poolside with a beer at a beautiful hotel in Dubai), life settling down (while sitting next to my amazing wife), and it now being 2 years since we launched the book, I decided we should just give it away for free. I mean really free, not "give me all your personal details, then here's a download link" I mean, here are the direct download links:

PDF
EPUB

I hope you enjoy the book. It's the culmination of so many things I worked so hard to create over the preceding decade and a half, and I'm really happy to just be giving it away now. Enjoy the book 😊

🏷️ My labels
- ❌
Article tags
December 6^th 2024 at 13:40

Troy Hunt
Welcoming the Armenian Government to Have I Been Pwned
December 4^th 2024 at 05:55

Welcoming the Armenian Government to Have I Been Pwned

By: Troy Hunt

Today, we're happy to welcome the 37th government to have full and free access to domain searches of their gov domains in Have I Been Pwned, Armenia. Armenia's National Computer Incident Response Team AM-CERT now joins three dozen other national counterparts in gaining visibility into how data breaches impact their national interests.

As we expand the reach of governments and organisations into HIBP, we hope to give defenders better insights into the impact of data breaches on their people so that the impact and value to attackers diminish.

🏷️ My labels
- ❌
Article tags
- ❌
- Government
December 4^th 2024 at 05:55

Troy Hunt
Weekly Update 428
December 1^st 2024 at 03:19

Weekly Update 428

By: Troy Hunt

I wouldn't say this is a list of my favourite breaches from this year as that's a bit of a disingenuous term, but oh boy were there some memorable ones. So many of the incidents I deal with are relatively benign in terms of either the data they expose or the nature of the service, but some of them this year were absolute zingers. This week, I'm talking about the ones that really stuck out to me for one reason or another, here's the top 5:

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
The Spoutible breach was one of the most bizarre instances of returning unnecessary data via an API I've ever seen (passwords, 2FA secrets and the code used in "magic links" to reset passwords)
It's one thing for spyware to be used for stalking partners against their terms and conditions, it was quite another for pcTattletale to explicitly refer to marital infidelity as a use case for the product (this data breach actually killed the company)
The "Combolists Posted to Telegram" breach was more significant for the stealer logs than it was the combolists aggregated from other sources (that really brought this class of breach into the spotlight for me)
The National Public Data breach was much more significant for the exposure of hundreds of millions of social security numbers than it was for the email addresses that went into HIBP (that's another company that folded as a result of their breach)
The Muah.AI breach exposed a trove of requests by users to create CSAM images (the linked thread is a mind-boggling series of tweets about both the content and the justifications offered for not having controls on the images created)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
December 1^st 2024 at 03:19

Troy Hunt
Weekly Update 427
November 25^th 2024 at 07:53

Weekly Update 427

By: Troy Hunt

I was going to write about how much I've enjoyed "tinkering" with the HIBP API, but somehow, that term doesn't really seem appropriate any more for a service of this scale. On the contrary, we're putting in huge amounts of effort to get this thing fast, stable, and sustainable. We could do the first two very easily just by throwing money at the cloud, but that makes the last one a bit hard. Besides, both Stefán and I do enjoy the challenge of optimising an increasingly large system to run on a shoestring and even though the days of "a coffee a day of running costs" are well behind us, arguably the cost per request (or some other usage-based metric) is better than ever. I hope you enjoy this chat between the two of us and as I say in the video, do please chime in with your thoughts and suggestions.

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
Read all the nitty-gritty about how we're getting "closer to the edge" (Stefán will follow this up with a more techie one on the SQL scaling side of things)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
November 25^th 2024 at 07:53

Troy Hunt
Closer to the Edge: Hyperscaling Have I Been Pwned with Cloudflare Workers and Caching
November 21^st 2024 at 07:35

Closer to the Edge: Hyperscaling Have I Been Pwned with Cloudflare Workers and Caching

By: Troy Hunt

I've spent more than a decade now writing about how to make Have I Been Pwned (HIBP) fast. Really fast. Fast to the extent that sometimes, it was even too fast:

The response from each search was coming back so quickly that the user wasn’t sure if it was legitimately checking subsequent addresses they entered or if there was a glitch.

Over the years, the service has evolved to use emerging new techniques to not just make things fast, but make them scale more under load, increase availability and sometimes, even drive down cost. For example, 8 years ago now I started rolling the most important services to Azure Functions, "serverless" code that was no longer bound by logical machines and would just scale out to whatever volume of requests was thrown at it. And just last year, I turned on Cloudflare cache reserve to ensure that all cachable objects remained cached, even under conditions where they previously would have been evicted.

And now, the pièce de résistance, the coolest performance thing we've done to date (and it is now "we", thank you Stefán): just caching the whole lot at Cloudflare. Everything. Every search you do... almost. Let me explain, firstly by way of some background:

When you hit any of the services on HIBP, the first place the traffic goes from your browser is to one of Cloudflare's 330 "edge nodes":

As I sit here writing this on the Gold Coast on Australia's most eastern seaboard, any request I make to HIBP hits that edge node on the far right of the Aussie continent which is just up the road in Brisbane. The capital city of our great state of Queensland is just a short jet ski away, about 80km as the crow flies. Before now, every single time I searched HIBP from home, my request bytes would travel up the wire to Brisbane and then take a giant 12,000km trip to Seattle where the Azure Function in the West US Azure data would query the database before sending the response 12,000km back west to Cloudflare's edge node, then the final 80km down to my Surfers Paradise home. But what if it didn't have to be that way? What if that data was already sitting on the Cloudflare edge node in Brisbane? And the one in Paris, and the one in well, I'm not even sure where all those blue dots are, but what if it was everywhere? Several awesome things would happen:

You'd get your response much faster as we've just shaved off more than 99% of the distance the bytes need to travel.
The availability would massively improve as there are far fewer nodes for the traffic to traverse through, plus when a response is cached, we're no longer dependent on the Azure Function or underlying storage mechanism.
We'd save on Azure Function execution costs, storage account hits and especially egress bandwidth (which is very expensive).

In short, pushing data and processing "closer to the edge" benefits both our customers and ourselves. But how do you do that for 5 billion unique email addresses? (Note: As of today, HIBP reports over 14 billion breached accounts, the number of unique email addresses is lower as on average, each breached address has appeared in multiple breaches.) To answer this question, let's recap on how the data is queried:

Via the front page of the website. This hits a "unified search" API which accepts an email address and uses Cloudflare's Turnstile to prohibit automated requests not originating from the browser.
Via the public API. This endpoint also takes an email address as input and then returns all breaches it appears in.
Via the k-anonyity enterprise API. This endpoint is used by a handful of large subscribers such as Mozilla and 1Password. Instead of searching by email address, it implements k-anonymity and searches by hash prefix.

Let's delve into that last point further because it's the secret sauce to how this whole caching model works. In order to provide subscribers of this service with complete anonymity over the email addresses being searched for, the only data passed to the API is the first six characters of the SHA-1 hash of the full email address. If this sounds odd, read the blog post linked to in that last bullet point for full details. The important thing for now, though, is that it means there are a total of 16^6 different possible requests that can be made to the API, which is just over 16 million. Further, we can transform the first two use cases above into k-anonymity searches on the server side as it simply involved hashing the email address and taking those first six characters.

In summary, this means we can boil the entire searchable database of email addresses down to the following:

AAAAAA
AAAAAB
AAAAAC
...about 16 million other values...
FFFFFD
FFFFFE
FFFFFF

That's a large albeit finite list, and that's what we're now caching. So, here's what a search via email address looks like:

Address to search: test@example.com
Full SHA-1 hash: 567159D622FFBB50B11B0EFD307BE358624A26EE
Six char prefix: 567159
API endpoint: https://[host]/[path]/567159
If hash prefix is cached, retrieve result from there
If hash prefix is not cached, query origin and save to cache
Return result to client

K-anonymity searches obviously go straight to step four, skipping the first few steps as we already know the hash prefix. All of this happens in a Cloudflare worker, so it's "code on the edge" creating hashes, checking cache then retrieving from the origin where necessary. That code also takes care of handling parameters that transform queries, for example, filtering by domain or truncating the response. It's a beautiful, simple model that's all self-contained within a worker and a very simple origin API. But there's a catch - what happens when the data changes?

There are two events that can change cached data, one is simple and one is major:

Someone opts out of public searchability and their email address needs to be removed. That's easy, we just call an API at Cloudflare and flush a single hash prefix.
A new data breach is loaded and there are changes to a large number of hash prefixes. In this scenario, we flush the entire cache and start populating it again from scratch.

The second point is kind of frustrating as we've built up this beautiful collection of data all sitting close to the consumer where it's super fast to query, and then we nuke it all and go from scratch. The problem is it's either that or we selectively purge what could be many millions of individual hash prefixes, which you can't do:

For Zones on Enterprise plan, you may purge up to 500 URLs in one API call.

And:

Cache-Tag, host, and prefix purging each have a rate limit of 30,000 purge API calls in every 24 hour period.

We're giving all this further thought, but it's a non-trivial problem and a full cache flush is both easy and (near) instantaneous.

Enough words, let's get to some pictures! Here's a typical week of queries to the enterprise k-anonymity API:

This is a very predictable pattern, largely due to one particular subscriber regularly querying their entire customer base each day. (Sidenote: most of our enterprise level subscribers use callbacks such that we push updates to them via webhook when a new breach impacts their customers.) That's the total volume of inbound requests, but the really interesting bit is the requests that hit the origin (blue) versus those served directly by Cloudflare (orange):

Let's take the lowest blue data point towards the end of the graph as an example:

At that time, 96% of requests were served from Cloudflare's edge. Awesome! But look at it only a little bit later:

That's when I flushed cache for the Finsure breach, and 100% of traffic started being directed to the origin. (We're still seeing 14.24k hits via Cloudflare as, inevitably, some requests in that 1-hour block were to the same hash range and were served from cache.) It then took a whole 20 hours for the cache to repopulate to the extent that the hit:miss ratio returned to about 50:50:

Look back towards the start of the graph and you can see the same pattern from when I loaded the DemandScience breach. This all does pretty funky things to our origin API:

That last sudden increase is more than a 30x traffic increase in an instant! If we hadn't been careful about how we managed the origin infrastructure, we would have built a literal DDoS machine. Stefán will write later about how we manage the underlying database to ensure this doesn't happen, but even still, whilst we're dealing with the cyclical support patterns seen in that first graph above, I know that the best time to load a breach is later in the Aussie afternoon when the traffic is a third of what it is first thing in the morning. This helps smooth out the rate of requests to the origin such that by the time the traffic is ramping up, more of the content can be returned directly from Cloudflare. You can see that in the graphs above; that big peaky block towards the end of the last graph is pretty steady, even though the inbound traffic the first graph over the same period of time increases quite significantly. It's like we're trying to race the increasing inbound traffic by building ourselves up a bugger in cache.

Here's another angle to this whole thing: now more than ever, loading a data breach costs us money. For example, by the end of the graphs above, we were cruising along at a 50% cache hit ratio, which meant we were only paying for half as many of the Azure Function executions, egress bandwidth, and underlying SQL database as we would have been otherwise. Flushing cache and suddenly sending all the traffic to the origin doubles our cost. Waiting until we're back at 90% cache it ratio literally increases those costs 10x when we flush. If I were to be completely financially ruthless about it, I would need to either load fewer breaches or bulk them together such that a cache flush is only ejecting a small amount of data anyway, but clearly, that's not what I've been doing 😄

There's just one remaining fly in the ointment...

Of those three methods of querying email addresses, the first is a no-brainer: searches from the front page of the website hit a Cloudflare Worker where it validates the Turnstile token and returns a result. Easy. However, the second two models (the public and enterprise APIs) have the added burden of validating the API key against Azure API Management (APIM), and the only place that exists is in the West US origin service. What this means for those endpoints is that before we can return search results from a location that may be just a short jet ski ride away, we need to go all the way to the other side of the world to validate the key and ensure the request is within the rate limit. We do this in the lightest possible way with barely any data transiting the request to check the key, plus we do it in async with pulling the data back from the origin service if it isn't already in cache. In other words, we're as efficient as humanly possible, but we still cop a massive latency burden.

Doing API management at the origin is super frustrating, but there are really only two alternatives. The first is to distribute our APIM instance to other Azure data centres, and the problem with that is we need a Premium instance of the product. We presently run on a Basic instance, which means we're talking about a 19x increase in price just to unlock that ability. But that's just to go Premium; we then need at least one more instance somewhere else for this to make sense, which means we're talking about a 28x increase. And every region we add amplifies that even further. It's a financial non-starter.

The second option is for Cloudflare to build an API management product. This is the killer piece of this puzzle, as it would put all the checks and balances within the one edge node. It's a suggestion I've put forward on many occasions now, and who knows, maybe it's already in the works, but it's a suggestion I make out of a love of what the company does and a desire to go all-in on having them control the flow of our traffic. I did get a suggestion this week about rolling what is effectively a "poor man's API management" within workers, and it's a really cool suggestion, but it gets hard when people change plans or when we want to apply quotas to APIs rather than rate limits. So c'mon Cloudflare, let's make this happen!

Finally, just one more stat on how powerful serving content directly from the edge is: I shared this stat last month for Pwned Passwords which serves well over 99% of requests from Cloudflare's cache reserve:

There it is - we’ve now passed 10,000,000,000 requests to Pwned Password in 30 days 😮 This is made possible with @Cloudflare’s support, massively edge caching the data to make it super fast and highly available for everyone. pic.twitter.com/kw3C9gsHmB
— Troy Hunt (@troyhunt) October 5, 2024

That's about 3,900 requests per second, on average, non-stop for 30 days. It's obviously way more than that at peak; just a quick glance through the last month and it looks like about 17k requests per second in a one-minute period a few weeks ago:

But it doesn't matter how high it is, because I never even think about it. I set up the worker, I turned on cache reserve, and that's it 😎

I hope you've enjoyed this post, Stefán and I will be doing a live stream on this topic at 06:00 AEST Friday morning for this week's regular video update, and it'll be available for replay immediately after. It's also embedded here for convenience:

🏷️ My labels
- ❌
Article tags
November 21^st 2024 at 07:35

Troy Hunt
Weekly Update 426
November 17^th 2024 at 02:39

Weekly Update 426

By: Troy Hunt

I have absolutely no problem at all talking about the code I've screwed up. Perhaps that's partly because after 3 decades of writing software (and doing some meaningful stuff along the way), I'm not particularly concerned about showing my weaknesses. And this week, I screwed up a bunch of stuff; database queries that weren't resilient to SQL database scale changes, partially completed breach notifications I didn't notice until it was too late to easily fix, and some queries that performed so badly they crashed the entire breach notification process after loading the massive DemandScience incident. Fortunately, none of them had any impact of note, we fixed them all and re-ran processes, and now we're more resilient than ever 😄

Oh - and if you like this style of content, this coming Friday, Stefan and I will do a joint live stream on all sorts of other bits about how now HIBP runs.

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
Elon Musk is right (I hate cookie warnings, but I'm entertained by people losing their minds "because Elon")
The Hot Topic breach went into HIBP (that's another 57M email addresses right there)
There are also now 122M more records in HIBP courtesy of the DemandScience breach (it's publicly aggregated data, but it's still a breach)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
November 17^th 2024 at 02:39

Troy Hunt
Inside the DemandScience by Pure Incubation Data Breach
November 13^th 2024 at 09:59

Inside the DemandScience by Pure Incubation Data Breach

By: Troy Hunt

Apparently, before a child reaches the age of 13, advertisers will have gathered more 72 million data points on them. I knew I'd seen a metric about this sometime recently, so I went looking for "7,000", which perfectly illustrates how unaware we are of the extent of data collection on all of us. I started Have I Been Pwned (HIBP) in the first place because I was surprised at where my data had turned up in breaches. 11 years and 14 billion breached records later, I'm still surprised!

Jason (not his real name) was also recently surprised at where his data had appeared. He found it in a breach of a service called "Pure Incubation", a company whose records had appeared on a popular hacking forum earlier this year:

#DataLeak Alert ⚠️⚠️⚠️

🚨Over 183 Million Pure Incubation Ventures Records for Sale 🚨

183,754,481 records belonging to Pure Incubation Ventures (https://t.co/m3sjzAMlXN) have been put up for sale on a hacking forum for $6,000 negotiable.

Additionally, the threat actor with… pic.twitter.com/tqsyb8plPG
— HackManac (@H4ckManac) February 28, 2024

When Jason found his email address and other info in this corpus, he had the same question so many others do when their data turns up in a place they've never heard of before - how? Why?! So, he asked them:

I seem to have found my email in your data breach. I am interested in finding how my information ended up in your database.

To their credit, he got a very comprehensive answer, which I've included below:

Well, that answers the "how" part of the equation; they've aggregated data from public sources. And the "why" part? It's the old "data is the new oil" analogy that recognises how valuable our info is, and as such, there's a market for it. There are lots of terms used to describe what DemandScience does, including "B2B demand generation", "buyer intelligence solutions provider", "empowering technology companies to accelerate ROI", "supercharging pipelines" and "account intelligence". Or, to put it in a more lay-person-friendly fashion, they sell data on people.

DemandScience is what we refer to as a "data aggregator" in that they combine identity data from multiple locations, bundle it up, and then sell it. Occasionally, data aggregators end up having sizeable data breaches; before today, HIBP already contained Adapt (9M records), Data & Leads (44M records), Exactis (132M records), Factual (2M records), and You've Been Scraped (66M records). According to DemandScience, "none of our current operational systems were exploited", yet simultaneously, "the leaked data originated from a system that has been decommissioned". So, it's a breach of an old system.

Does it matter? I mean, if it's just public data, should people care? Jason cared, at least enough to make the original enquiry and for DemandScience to look him up and realise he's not in their current database. Still, he existed in the breached one (I later sent Jason his record from the breach, and he confirmed the accuracy). As I often do in these cases, I reached out to a bunch of recent HIBP subscribers in the breach and asked them three simple questions:

Is the data about you accurate and if not, which bits are wrong?
Is this data you would consider to be in the public domain already?
Would you expect to be notified about your data being used in this fashion, and consequently appearing a breach?

The answers were all the same: the data is accurate, it's already in the public domain, and people aren't too concerned about it appearing in this breach. Well that was easy 🙂 However...

There are two nuances that aren't captured here, and the first one is that this is valuable data, that's why DemandScience sells it! It comes back to that "new oil" analogy and if you have enough of it, you can charge good money for it. Companies typically use data such as this to do precisely the sort of catchphrasey stuff the company talks about, primarily around maximising revenue from their customers by understanding them better.

The second nuance is that whilst this data may already be in the public domain, did the owners of it expect it to be used in this fashion? For example, if you publish your details in a business directory, is your expectation that this info may then be sold to other companies to help them upsell you on their products? Probably not. And if, like many of the records in the data, someone's row is accompanied by their LinkedIn profile, would they expect that data to matched and sold? I suggest the responses would likely be split here, and that in itself is an important observation: how we view the sensitivity of our data and the impact of it being exposed (whether personal or business) is extremely personal. Some people take the view of "I have nothing to hide", whilst others become irate if even just their email address is exposed.

Whilst considering how to add more insights to this blog post, I thought I'd do a quick check on just one more email address:

"54543060",,"0","TROY","HUNT","PO BOX 57",,"WEST RYDE",,,"AU","61298503333",,,,"troy.hunt@pfizer.com","pfizer.com","PFIZER INC",,"250-499","$50 - 99 Million","Healthcare, Pharmaceuticals and Biotech","VICE PRESIDENT OF INFORMATION TECHNOLOGY","VP Level","2834",,"Senior Management (SVP/GM/Director)","IT",,"1","GemsTarget INTL","GEMSTARGET_INTL_648K_10.17.18",,,,,,,,,"18/10/2018 05:12:39","5/10/2021 16:47:56","PFIZER.COM",,,,,"IT Management General","Information Technology"

I'll be entirely transparent and honest here - my exact words after finding this were "motherfucker!" True story, told uncensored here because I want to impress on the audience how I feel when my data turns up somewhere publicly. And I do feel like it's "my" data; it's certainly my name and even though it's my old Pfizer email address I've not used for almost a decade now, that also has my name in it. My job title is also there... and it's completely wrong! I never had a VP-level role, even though the other data around my tech role is at least in the vicinity of being correct. But other than the initial shock of finding myself in yet another data breach, personally, I'm in the same boat as the HIBP subscribers I contacted, and this doesn't bother me too much. But I also agree with the following responses I received to my third question:

I think it is useful to be notified of such breaches, even if it is just to confirm no sensitive data has been compromised. As I said, our IT department recently notified me that some of my data was leaked and a pre-emptive password reset was enforced as they didn't know what was leaked.

It would be good to see it as an informational notification in case there's an increase in attack attempts against my email address.

I would like to opt-out of here to reduce the SPAM and Phishing emails.

That last one seems perfectly reasonable, and fortunately, DemandScience does have a link on their website to Do Not Sell My Information:

Dammit! If, like me, you're part of the 99.5% of the world that doesn't live in California, then apparently this form isn't for you. However, they do list dataprivacy@demandscience.com on that page, which is the same address Jason was communicating with above. Chances are, if you want to remove your data then that's where to start.

There were almost 122M unique email addresses in this corpus and those have now been added to HIBP. Treat this as informational; I suspect that for most people, it won't bother them, whilst others will ask for their data not to be sold (regardless of where they live in the world). But in all likelihood, there will be more than a handful of domain subscribers who take issue with that volume of people data sitting there in one corpus easily downloadable via a clear web hacking forum. For example, mine was just one of many tens of thousands of Pfizer email addresses, and that sort of thing is going to raise the ire of some folks in corporate infosec capacities.

One last comment: there was a story published earlier this year titled Our Investigation of the Pure Incubation Ventures Leak and in there they refer to "encrypted passwords" being present in the data. Many of the files do contain a column with bcrypt hashes (which is definitely not encryption), but given the way in which this data was collated, I can see no evidence whatsoever that these are password hashes. As such, I haven't listed "Passwords" as one of the compromised data classes in HIBP and you find yourself in this breach, I wouldn't be at all worried about this.

🏷️ My labels
- ❌
Article tags
- ❌
- Have I Been Pwned
November 13^th 2024 at 09:59

Troy Hunt
Weekly Update 425
November 9^th 2024 at 07:15

Weekly Update 425

By: Troy Hunt

This was a much longer than usual update, largely due to the amount of time spent discussing the Earth 2 incident. As I said in the video (many times!), the amount of attention this has garnered from both Earth 2 users and the company itself is incommensurate with the impact of the incident itself. It's a nothing-burger. Email addresses and usernames, that's it, and of course, their association with the service, which may lead to some very targeted spam or phishing attempts. It's still a breach by any reasonable definition of the term, but it should have been succinctly summarised and disclosed to impacted parties with everyone moving on with more important things in life a few moments later. And that's exactly what I'm going to do right now 😊

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
Speaking of giving a nothing-burger incident more attention than it deserves, the Earth 2 Twitter screed hasn't done them any favours (something something Streisand effect)
Data breach disclosure 101: How to succeed after you've failed (7 years on, this is still the guidance I give breached orgs)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
November 9^th 2024 at 07:15

Troy Hunt
Weekly Update 424
November 3^rd 2024 at 07:33

Weekly Update 424

By: Troy Hunt

I have really clear memories of listening to the Stack Overflow podcast in the late 2000's and hearing Jeff and Joel talk about the various challenges they were facing and the things they did to overcome them. I just suddenly thought of that when realising how long this week's video went for with no real plan other than to talk about our HIBP backlog. People seem to love this in the same way I loved listening to the guys a decade and a half ago. I'll do one of these with Stefan as well over the course of this month, let us know what you'd like to hear about 😊

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
November 3^rd 2024 at 07:33

Troy Hunt
Weekly Update 423
October 26^th 2024 at 22:41

Weekly Update 423

By: Troy Hunt

Firstly, my apologies for the minute and a bit of echo at the start of this video, OBS had somehow magically decided to start recording both the primary mic and the one built into my camera. Easy fix, moving on...

During the livestream, I was perplexed as to why the HIBP DB was suddenly maxing out. Turns out that this aligned with dropping a constraint on the table of domains which appears to have caused the table to reindex and massively slow down the queries for breached email addresses. Further, we simultaneously started having problems related to MAXDOP (the maximum degree of parallelism for the stored procedure running the query), which was only resolved after we forced it to not run on multiple CPUs by setting it to 1 (weirdly, 2 is also fine but 3 or higher completely killed perf). Fun times, running a service like this.

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
The Internet Archive's Zendesk was accessed and replies sent to a bunch of tickets (it's just gone from bad to bad for them, and still no disclosure to individuals...)
Basically everyone thinks unauthorised access should result in breach notifications being sent to impact individuals (I mean, it's a predictable outcome, but there were still some wacky arguments against it)
I'm feeling pretty damn exasperated about the lack of breach disclosure lately (multiple incidents this year have included my own personal data, and I'm pissed)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
October 26^th 2024 at 22:41

Troy Hunt
Weekly Update 422
October 21^st 2024 at 02:43

Weekly Update 422

By: Troy Hunt

Apparently, Stefan and I trying to work stuff out in real time about how to build more efficient features in HIBP is entertaining watching! If I was to guess, I think it's just seeing people work through the logic of how things work and how we might be able to approach things differently, and doing it in real time very candidly. I'm totally happy doing that, and the comments from the audience did give us more good food for thought too. I'll try and line up a session just like that before the end of the year, we've certainly got no shortage of material!

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
If you read the BBC, I hacked Internet Archive (this was followed by much apologising, but it was still pretty damn sloppy writing)
Muah.AI and their users continue to push back against controls to limit child abuse requests (and when they talk about implementing controls, the users get upset)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
October 21^st 2024 at 02:43

Troy Hunt
Weekly Update 421
October 14^th 2024 at 00:17

Weekly Update 421

By: Troy Hunt

It wasn't easy talking about the Muah.AI data breach. It's not just the rampant child sexual abuse material throughout the system (or at least requests for the AI to generate images of it), it's the reactions of people to it. The tweets justifying it on the basis of there being noo "actual" abuse, the characterisation of this being akin to "merely thoughts in someone's head", and following my recording of this video, the backlash from their users about any attempts to curb creating sexual image of young children being "too much":

Which is making customers unhappy - "any censorship is too much": pic.twitter.com/fzfrFdKL8w
— Troy Hunt (@troyhunt) October 12, 2024

The law will catch up with this (and anyone in that breach creating this sort of material should be feel very bloody nervous right now), and the writing is already on the wall for people generating CSAM via AI:

This bill would expand the scope of certain of these provisions to include matter that is digitally altered or generated by the use of artificial intelligence, as such matter is defined.

The bill can't pass soon enough.

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
The Muah.AI data breach revealed an enormous volume of requests for CSAM material (you can hear me struggling to even properly explain this, it's just hard to find the words)
Internet Archive was breached, defaced and DDoS'd (4 days on from that tweet thread, they're still offline)
National Public Data - the service that siphoned up hundreds of millions of social security numbers then exposed them all in a breach - is dead (now, how many more of these are left?)

🏷️ My labels
- ❌
Article tags
- ❌
- Weekly update
October 14^th 2024 at 00:17