Steve Rymell, Head of Technology at Airbus CyberSecurity, answers: What Should Frighten Us About AI-Based Malware?

 

Of all the cybersecurity industry’s problems, one of the most striking is the way attackers are often able to stay one step ahead of defenders without working terribly hard. It’s an issue whose root causes are mostly technical: the prime example is software vulnerabilities, which cyber-criminals have a habit of discovering before vendors and their customers do, leading to the almost undefendable zero-day phenomenon behind many famous cyber-attacks.

A second is that organizations struggling with the complexity of new and unfamiliar technologies make mistakes, inadvertently leaving vulnerable ports and services exposed. Starkest of all, perhaps, is the way techniques, tools, and infrastructure set up to help organizations defend themselves (Shodan, for example, but also numerous pen-test tools) are now just as likely to be turned against businesses by attackers who tear into networks with the aggression of red teams gone rogue.

Add to this the polymorphic nature of modern malware, and attackers can appear so conceptually unstoppable that it’s no wonder security vendors increasingly emphasize the need not so much to block attacks as to respond to them as quickly as possible.

The AI fightback
Some years back, a wave of mostly US-based start-ups mounted a counter-attack against the doom and gloom with a brave new idea: security powered by machine learning (ML) algorithms. In an age of big data this makes complete sense, and the idea has since been taken up by all manner of systems used for anti-spam, malware detection, threat analysis and intelligence, and Security Operations Centre (SOC) automation, where it has been proposed as a way to patch skills shortages.

I’d rate these as useful advances, but there’s no getting away from the controversial nature of the theory, which has been branded by some as the ultimate example of technology as a ‘black box’ nobody really understands. How do we know that machine learning is able to detect new and unknown types of attack that conventional systems fail to spot? In some cases, it could be because the product brochure says so.

Then the even bigger gotcha hits you – what’s stopping attackers from outfoxing defensive ML with even better ML of their own? If this were possible, even some of the time, the industry would find itself back at square one.

This is pure speculation, of course, because to date nobody has detected AI being used in a cyber-attack, which is why our understanding of how it might work remains largely based around academic research such as IBM’s proof-of-concept DeepLocker malware project.

What might malicious ML look like?
It would be unwise to ignore the potential for trouble. One of the biggest hurdles faced by attackers is quickly understanding what works, for example when sending spam, phishing and, increasingly, political disinformation.

It’s not hard to imagine that big data techniques allied to ML could hugely improve the efficiency of these threats by analyzing how targets react to and share them in real time. This implies the possibility that such campaigns might one day evolve in a matter of hours or minutes; a timescale defenders would struggle to counter using today’s technologies.

A second scenario is one that defenders wouldn’t even see: cyber-criminals might simulate the defenses of a target using their own ML to gauge the success of different attacks (a technique already routinely used to evade anti-virus). Once again, this exploits the advantage that attackers always have sight of the target, while defenders must rely on good guesses.

Or perhaps ML could simply be used to crank out far larger quantities of new and unique malware than is possible today. Whichever of these approaches is taken – and this is only a sample of the possibilities – it jumps out at you how awkward it would be to defend against even relatively simple ML-based attacks. About the only consolation is that if ML-based AI really is a black box that nobody understands then, logically, the attackers won’t understand it either and will waste time experimenting.

Unintended consequences
If we should fear anything it’s precisely this black box effect. There are two parts to this, the biggest of which is the potential for ML-based malware to cause something unintended to happen, especially when targeting critical infrastructure.

This phenomenon has already come to pass with non-AI malware – Stuxnet in 2010 and NotPetya in 2017 are the obvious examples – both of which infected thousands of organizations not on their original target list after unexpectedly ‘escaping’ into the wild.

When it comes to powerful malware exploiting multiple zero days there’s no such thing as a reliably contained attack. Once released, this kind of malware remains pathogenically dangerous until every system it can infect is patched or taken offline, which might be years or decades down the line.

Another anxiety is that because the expertise to understand ML is still thin on the ground, there’s a danger that engineers could come to rely on it without fully understanding its limitations, both by leaning on it for defense and by over-estimating its usefulness in attack. The mistake, then, might be that too many over-invest in it based on marketing promises that end up consuming resources better deployed elsewhere. Once a more realistic assessment takes hold, ML could end up as just another tool that is good at solving certain very specific problems.

Conclusion
My contradictory-sounding conclusion is that perhaps ML and AI make no fundamental difference at all. It’s just another stop on a journey computer security has been making since the beginning of digital time. The problem is overcoming our preconceptions about what it is and what it means. Chiefly, we must overcome the tendency to think of ML and AI as mysteriously ‘other’ because we don’t understand it and therefore find it difficult to process the concept of machines making complex decisions.

It’s not as if attackers aren’t breaching networks already with today’s pre-ML technology, or that well-prepared defenders aren’t regularly stopping them using the same technology. What AI reminds us is that the real difference lies in how well organizations are defended, not in whether they or their attackers use ML and AI. That has always been what separates survivors from victims. Cybersecurity remains a working demonstration of how the devil takes the hindmost.

Source: https://www.infosecurity-magazine.com/opinions/frighten-ai-malware-1/


Do you know who your iPhone is talking to?

 

https://www.washingtonpost.com/technology/2019/05/28/its-middle-night-do-you-know-who-your-iphone-is-talking/?noredirect=on

Yet these days, we spend more time in apps. Apple is strict about requiring apps to get permission to access certain parts of the iPhone, including your camera, microphone, location, health information, photos and contacts. (You can check and change those permissions under privacy settings.) But Apple turns more of a blind eye to what apps do with data we provide them or they generate about us — witness the sorts of tracking I found by looking under the covers for a few days.

“For the data and services that apps create on their own, our App Store Guidelines require developers to have clearly posted privacy policies and to ask users for permission to collect data before doing so. When we learn that apps have not followed our Guidelines in these areas, we either make apps change their practice or keep those apps from being on the store,” Apple says.

Yet very few apps I found using third-party trackers disclosed the names of those companies or how they protect my data. And what good is burying this information in privacy policies, anyway? What we need is accountability.

Getting more deeply involved in app data practices is complicated for Apple. Today’s technology frequently is built on third-party services, so Apple couldn’t simply ban all connections to outside servers. And some companies are so big they don’t even need the help of outsiders to track us.

The result shouldn’t be to increase Apple’s power. “I would like to make sure they’re not stifling innovation,” says Andrés Arrieta, the director of consumer privacy engineering at the Electronic Frontier Foundation. If Apple becomes the Internet’s privacy police, it could shut down rivals.

Jackson suggests Apple could also add controls into iOS like the ones built into Privacy Pro to give everyone more visibility.

Or perhaps Apple could require apps to label when they’re using third-party trackers. If I opened the DoorDash app and saw nine tracker notices, it might make me think twice about using it.


Forget privacy: you’re terrible at targeting anyway

I don’t mind letting your programs see my private data as long as I get something useful in exchange. But that’s not what happens.

A former co-worker told me once: “Everyone loves collecting data, but nobody loves analyzing it later.” This claim is almost shocking, but people who have been involved in data collection and analysis have all seen it. It starts with a brilliant idea: we’ll collect information about every click someone makes on every page in our app! And we’ll track how long they hesitate over a particular choice! And how often they use the back button! How many seconds they watch our intro video before they abort! How many times they reshare our social media post!

And then they do track all that. Tracking it all is easy. Add some log events, dump them into a database, off we go.

But then what? Well, after that, we have to analyze it. And as someone who has analyzed a lot of data about various things, let me tell you: being a data analyst is difficult and mostly unrewarding (except financially).

See, the problem is there’s almost no way to know if you’re right. (It’s also not clear what the definition of “right” is, which I’ll get to in a bit.) There are almost never any easy conclusions, just hard ones, and the hard ones are error-prone. What analysts don’t talk about is how many incorrect charts (and therefore conclusions) get made on the way to making correct ones. Or ones we think are correct. A good chart is so incredibly persuasive that it almost doesn’t even matter if it’s right, as long as what you want is to persuade someone… which is probably why newspapers, magazines, and lobbyists publish so many misleading charts.

But let’s leave errors aside for the moment. Let’s assume, very unrealistically, that we as a profession are good at analyzing things. What then?

Well, then, let’s get rich on targeted ads and personalized recommendation algorithms. It’s what everyone else does!

Or do they?

The state of personalized recommendations is surprisingly terrible. At this point, the top recommendation is always a clickbait rage-creating article about movie stars or whatever Trump did or didn’t do in the last 6 hours. Or if not an article, then a video or documentary. That’s not what I want to read or to watch, but I sometimes get sucked in anyway, and then it’s recommendation apocalypse time, because the algorithm now thinks I like reading about Trump, and now everything is Trump. Never give positive feedback to an AI.

This is, by the way, the dirty secret of the machine learning movement: almost everything produced by ML could have been produced, more cheaply, using a very dumb heuristic you coded up by hand, because mostly the ML is trained by feeding it examples of what humans did while following a very dumb heuristic. There’s no magic here. If you use ML to teach a computer how to sort through resumes, it will recommend you interview people with male, white-sounding names, because it turns out that’s what your HR department already does. If you ask it what video a person like you wants to see next, it will recommend some political propaganda crap, because 50% of the time 90% of the people do watch that next, because they can’t help themselves, and that’s a pretty good success rate.
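To make the “dumb heuristic” point concrete, here is a toy sketch (names and watch histories invented) of the kind of hand-coded rule that a trained model often ends up reproducing: just recommend whatever item most people clicked next.

```python
from collections import Counter, defaultdict

def train_next_item(histories):
    """Count, for each item, which item users most often chose next."""
    followers = defaultdict(Counter)
    for history in histories:
        for current, nxt in zip(history, history[1:]):
            followers[current][nxt] += 1
    # The whole "model" is the single most common follower of each item.
    return {item: counts.most_common(1)[0][0]
            for item, counts in followers.items()}

# Toy watch histories: most people who watch 'news' click 'outrage' next.
histories = [
    ["news", "outrage", "cats"],
    ["news", "outrage"],
    ["cats", "news", "outrage"],
    ["news", "cats"],
]
model = train_next_item(histories)
print(model["news"])  # → outrage
```

A recommender trained on these same histories would converge on the same answer, because the histories themselves were generated by people following the dumb pattern.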

(Side note: there really are some excellent uses of ML out there, for things traditional algorithms are bad at, like image processing or winning at strategy games. That’s wonderful, but chances are good that your pet ML application is an expensive replacement for a dumb heuristic.)

Someone who works on web search once told me that they already have an algorithm that guarantees the maximum click-through rate for any web search: just return a page full of porn links. (Someone else said you can reverse this to make a porn detector: any link which has a high click-through rate, regardless of which query it’s answering, is probably porn.)
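The reversed heuristic is easy to sketch. Assuming aggregated (query, link, impressions, clicks) logs (the thresholds and domain names below are invented), a few lines flag links that get clicked at a high rate no matter what was searched:

```python
from collections import defaultdict

def flag_query_independent_links(click_logs, ctr_threshold=0.5, min_queries=3):
    """click_logs: iterable of (query, link, impressions, clicks) aggregates.
    Flag links clicked at a high rate across many unrelated queries."""
    stats = defaultdict(lambda: {"queries": set(), "shown": 0, "clicked": 0})
    for query, link, shown, clicked in click_logs:
        s = stats[link]
        s["queries"].add(query)
        s["shown"] += shown
        s["clicked"] += clicked
    return sorted(
        link for link, s in stats.items()
        if len(s["queries"]) >= min_queries
        and s["clicked"] / s["shown"] >= ctr_threshold
    )

logs = [
    ("weather", "nsfw.example", 100, 80),
    ("recipes", "nsfw.example", 100, 90),
    ("taxes",   "nsfw.example", 100, 70),
    ("weather", "weather.example", 100, 60),
]
print(flag_query_independent_links(logs))  # → ['nsfw.example']
```

The site that gets clicked only when it actually answers the query escapes the filter; the one clicked regardless of the query does not.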

Now, the thing is, legitimate-seeming businesses can’t just give you porn links all the time, because that’s Not Safe For Work, so the job of most modern recommendation algorithms is to return the closest thing to porn that is still Safe For Work. In other words, celebrities (ideally attractive ones, or at least controversial ones), or politics, or both. They walk that line as closely as they can, because that’s the local maximum for their profitability. Sometimes they accidentally cross that line, and then have to apologize or pay a token fine, and then go back to what they were doing.

This makes me sad, but okay, it’s just math. And maybe human nature. And maybe capitalism. Whatever. I might not like it, but I understand it.

My complaint is that none of the above had anything to do with hoarding my personal information.

The hottest recommendations have nothing to do with me

Let’s be clear: the best targeted ads I will ever see are the ones I get from a search engine when it serves an ad for exactly the thing I was searching for. Everybody wins: I find what I wanted, the vendor helps me buy their thing, and the search engine gets paid for connecting us. I don’t know anybody who complains about this sort of ad. It’s a good ad.

And it, too, had nothing to do with my personal information!

Google was serving targeted search ads decades ago, before it ever occurred to them to ask me to log in. Even today you can still use every search engine web site without logging in. They all still serve ads targeted to your search keyword. It’s an excellent business.

There’s another kind of ad that works well on me. I play video games sometimes, and I use Steam, and sometimes I browse through games on Steam and star the ones I’m considering buying. Later, when those games go on sale, Steam emails me to tell me they are on sale, and sometimes then I buy them. Again, everybody wins: I got a game I wanted (at a discount!), the game maker gets paid, and Steam gets paid for connecting us. And I can disable the emails if I want, but I don’t want, because they are good ads.

But nobody had to profile me to make that happen! Steam has my account, and I told it what games I wanted and then it sold me those games. That’s not profiling, that’s just remembering a list that I explicitly handed to you.

Amazon shows a box that suggests I might want to re-buy certain kinds of consumable products that I’ve bought in the past. This is useful too, and requires no profiling other than remembering the transactions we’ve had with each other in the past, which they kinda have to do anyway. And again, everybody wins.
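That re-buy box needs nothing beyond the order history itself. A minimal sketch (products, dates, and the "typical gap" rule are all invented): suggest a product once the usual interval between purchases has elapsed.

```python
from collections import defaultdict
from datetime import date

def rebuy_suggestions(purchases, today):
    """purchases: list of (product, purchase_date). Suggest products bought
    2+ times whose usual repurchase interval has elapsed since the last one."""
    by_product = defaultdict(list)
    for product, d in purchases:
        by_product[product].append(d)
    suggestions = []
    for product, dates in by_product.items():
        dates.sort()
        if len(dates) < 2:
            continue  # a single purchase gives no interval to estimate
        gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
        typical = sum(gaps) / len(gaps)
        if (today - dates[-1]).days >= typical:
            suggestions.append(product)
    return suggestions

purchases = [
    ("coffee beans", date(2019, 1, 1)),
    ("coffee beans", date(2019, 2, 1)),
    ("coffee beans", date(2019, 3, 1)),
    ("monitor", date(2019, 3, 15)),
]
print(rebuy_suggestions(purchases, date(2019, 4, 5)))  # → ['coffee beans']
```

No profile, no trackers: just the transactions the store had to record anyway.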

Now, Amazon also recommends products like the ones I’ve bought before, or looked at before. That’s, say, 20% useful. If I just bought a computer monitor, and you know I did because I bought it from you, then you might as well stop selling them to me. But for a few days after I buy any electronics they also keep offering to sell me USB cables, and they’re probably right. So okay, 20% useful targeting is better than 0% useful. I give Amazon some credit for building a useful profile of me, although it’s specifically a profile of stuff I did on their site and which they keep to themselves. That doesn’t seem too invasive. Nobody is surprised that Amazon remembers what I bought or browsed on their site.

Worse is when (non-Amazon) vendors get the idea that I might want something. (They get this idea because I visited their web site and looked at it.) So their advertising partner chases me around the web trying to sell me the same thing. They do that, even if I already bought it. Ironically, this is because of a half-hearted attempt to protect my privacy. The vendor doesn’t give information about me or my transactions to their advertising partner (because there’s an excellent chance it would land them in legal trouble eventually), so the advertising partner doesn’t know that I bought it. All they know (because of the advertising partner’s tracker gadget on the vendor’s web site) is that I looked at it, so they keep advertising it to me just in case.

But okay, now we’re starting to get somewhere interesting. The advertiser has a tracker that it places on multiple sites and tracks me around. So it doesn’t know what I bought, but it does know what I looked at, probably over a long period of time, across many sites.

Using this information, its painstakingly trained AI makes conclusions about which other things I might want to look at, based on…

…well, based on what? People similar to me? Things my Facebook friends like to look at? Some complicated matrix-driven formula humans can’t possibly comprehend, but which is 10% better?

Probably not. Probably what it does is infer my gender, age, income level, and marital status. After that, it sells me cars and gadgets if I’m a guy, and fashion if I’m a woman. Not because all guys like cars and gadgets, but because some very uncreative human got into the loop and said “please sell my car mostly to men” and “please sell my fashion items mostly to women.” Maybe the AI infers the wrong demographic information (I know Google has mine wrong) but it doesn’t really matter, because it’s usually mostly right, which is better than 0% right, and advertisers get some mostly demographically targeted ads, which is better than 0% targeted ads.
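If that sounds too crude to be real, note how little code the "uncreative human in the loop" version takes. A hypothetical sketch, with made-up campaign rules and a made-up inferred profile:

```python
# Campaign rules as a human account manager might configure them by hand.
CAMPAIGN_RULES = [
    {"campaign": "sports car",   "gender": "male",   "min_income": 50_000},
    {"campaign": "fashion line", "gender": "female", "min_income": 0},
]

def pick_ads(profile):
    """profile: inferred (and often wrong) demographics,
    e.g. {'gender': 'male', 'income': 60000}."""
    return [
        rule["campaign"] for rule in CAMPAIGN_RULES
        if rule["gender"] == profile.get("gender")
        and profile.get("income", 0) >= rule["min_income"]
    ]

print(pick_ads({"gender": "male", "income": 60_000}))  # → ['sports car']
```

All the expensive tracking feeds into a lookup table that a 1990s media buyer could have filled in by hand.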

You know this is how it works, right? It has to be. You can infer it from how bad the ads are. Anyone can, in a few seconds, think of some stuff they really want to buy which The Algorithm has failed to offer them, all while Outbrain makes zillions of dollars sending links about car insurance to non-car-owning Manhattanites. It might as well be a 1990s late-night TV infomercial, where all they knew for sure about my demographic profile is that I was still awake.

You tracked me everywhere I go, logging it forever, begging for someone to steal your database, desperately fearing that some new EU privacy regulation might destroy your business… for this?

Statistical Astrology

Of course, it’s not really as simple as that. There is not just one advertising company tracking me across every web site I visit. There are… many advertising companies tracking me across every web site I visit. Some of them don’t even do advertising, they just do tracking, and they sell that tracking data to advertisers who supposedly use it to do better targeting.

This whole ecosystem is amazing. Let’s look at online news web sites. Why do they load so slowly nowadays? Trackers. No, not ads – trackers. They only have a few ads, which mostly don’t take that long to load. But they have a lot of trackers, because each tracker will pay them a tiny bit of money to be allowed to track each page view. If you’re a giant publisher teetering on the edge of bankruptcy and you have 25 trackers on your web site already, but tracker company #26 calls you and says they’ll pay you $50k a year if you add their tracker too, are you going to say no? Your page runs like sludge already, so making it 1/25th more sludgy won’t change anything, but that $50k might.

(“Ad blockers” remove annoying ads, but they also speed up the web, mostly because they remove trackers. Embarrassingly, the trackers themselves don’t even need to cause a slowdown, but they always do, because their developers are invariably idiots who each need to load thousands of lines of javascript to do what could be done in two. But that’s another story.)

Then the ad sellers, and ad networks, buy the tracking data from all the trackers. The more tracking data they have, the better they can target ads, right? I guess.

The brilliant bit here is that each of the trackers has a bit of data about you, but not all of it, because not every tracker is on every web site. But on the other hand, cross-referencing individuals between trackers is kinda hard, because none of them wants to give away their secret sauce. So each ad seller tries their best to cross-reference the data from all the tracker data they buy, but it mostly doesn’t work. Let’s say there are 25 trackers each tracking a million users, probably with a ton of overlap. In a sane world we’d guess that there are, at most, a few million distinct users. But in an insane world where you can’t prove if there’s an overlap, it could be as many as 25 million distinct users! The more tracker data your ad network buys, the more information you have! Probably! And that means better targeting! Maybe! And so you should buy ads from our network instead of the other network with less data! I guess!
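The arithmetic of that insane world is simple. A sketch with invented user IDs, comparing the reach an ad network can claim when it can't de-duplicate users across trackers against the true number of distinct people:

```python
def naive_reach(tracker_audiences):
    """What a network claims when tracker IDs can't be cross-referenced:
    just the sum of each tracker's audience."""
    return sum(len(audience) for audience in tracker_audiences)

def true_reach(tracker_audiences):
    """Actual distinct users, if the IDs could be joined."""
    users = set()
    for audience in tracker_audiences:
        users |= audience
    return len(users)

# Three trackers, each seeing four users, with heavy overlap.
audiences = [
    {"u1", "u2", "u3", "u4"},
    {"u2", "u3", "u4", "u5"},
    {"u1", "u3", "u4", "u5"},
]
print(naive_reach(audiences))  # → 12
print(true_reach(audiences))   # → 5
```

The gap between the two numbers is pure sales pitch: buying more overlapping tracker data inflates the first figure without changing the second.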

None of this works. They are still trying to sell me car insurance for my subway ride.

It’s not just ads

That’s a lot about profiling for ad targeting, which obviously doesn’t work, if anyone would just stop and look at it. But there are way too many people incentivized to believe otherwise. Meanwhile, if you care about your privacy, all that matters is they’re still collecting your personal information whether it works or not.

What about content recommendation algorithms though? Do those work?

Obviously not. I mean, have you tried them? Seriously.

That’s not quite fair. There are a few things that work. Pandora’s music recommendations are surprisingly good, but they are doing it in a very non-obvious way. The obvious way is to take the playlist of all the songs your users listen to, blast it all into an ML training dataset, and then use that to produce a new playlist for new users based on… uh… their… profile? Well, they don’t have a profile yet because they just joined. Perhaps based on the first few songs they select manually? Maybe, but they probably started with either a really popular song, which tells you nothing, or a really obscure song to test the thoroughness of your library, which tells you less than nothing.

(I’m pretty sure this is how Mixcloud works. After each mix, it tries to find the “most similar” mix to continue with. Usually this is someone else’s upload of the exact same mix. Then the “most similar” mix to that one is the first one, so it does that. Great job, machine learning, keep it up.)

That leads us to the “random song followed by thumbs up/down” system that everyone uses. But everyone sucks, except Pandora. Why? Apparently because Pandora spent a lot of time hand-coding a bunch of music characteristics and writing a “real algorithm” (as opposed to ML) that tries to generate playlists based on the right combinations of those characteristics.

In that sense, Pandora isn’t pure ML. It often converges on a playlist you’ll like within one or two thumbs up/down operations, because you’re navigating through a multidimensional interconnected network of songs that people encoded the hard way, not a massive matrix of mediocre playlists scraped from average people who put no effort into generating those playlists in the first place. Pandora is bad at a lot of things (especially “availability in Canada”) but their music recommendations are top notch.
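A minimal sketch of that idea (the songs and feature sets here are invented; Pandora's real hand-coded attribute catalog is far richer): score each candidate song by its feature overlap with thumbed-up songs, penalized by overlap with thumbed-down ones. No listener profile is involved at any point.

```python
def similarity(a, b):
    """Jaccard overlap between two hand-coded attribute sets."""
    return len(a & b) / len(a | b)

def next_song(catalog, liked, disliked):
    """Pick the unplayed song most similar to the thumbs-up songs
    and least similar to the thumbs-down ones."""
    def score(name):
        feats = catalog[name]
        up = sum(similarity(feats, catalog[s]) for s in liked)
        down = sum(similarity(feats, catalog[s]) for s in disliked)
        return up - down
    candidates = [s for s in catalog if s not in liked and s not in disliked]
    return max(candidates, key=score)

catalog = {
    "song A": {"acoustic", "female vocal", "slow"},
    "song B": {"acoustic", "female vocal", "folk"},
    "song C": {"electronic", "fast", "instrumental"},
    "song D": {"electronic", "fast", "male vocal"},
}
print(next_song(catalog, liked={"song A"}, disliked={"song C"}))  # → song B
```

Everything doing the work here is the hand-encoded feature network, exactly the "real algorithm" approach described above.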

Just one catch. If Pandora can figure out a good playlist based on a starter song and one or two thumbs up/down clicks, then… I guess it’s not profiling you. They didn’t need your personal information either.

Netflix

While we’re here, I just want to rant about Netflix, which is an odd case of starting off with a really good recommendation algorithm and then making it worse on purpose.

Once upon a time, there was the Netflix prize, which granted $1 million to the best team that could predict people’s movie ratings, based on their past ratings, with better accuracy than Netflix could themselves. (This not-so-shockingly resulted in a privacy fiasco when it turned out you could de-anonymize the data set that they publicly released, oops. Well, that’s what you get when you long-term store people’s personal information in a database.)

Netflix believed their business depended on a good recommendation algorithm. It was already pretty good: I remember using Netflix around 10 years ago and getting several recommendations for things I would never have discovered, but which I turned out to like. That hasn’t happened to me on Netflix in a long, long time.

As the story goes, once upon a time Netflix was a DVD-by-mail service. DVD-by-mail is really slow, so it was absolutely essential that at least one of this week’s DVDs was good enough to entertain you for your Friday night movie. Too many Fridays with only bad movies, and you’d surely unsubscribe. A good recommendation system was key. (I guess there was also some interesting math around trying to make sure to rent out as much of the inventory as possible each week, since having a zillion copies of the most recent blockbuster, which would be popular this month and then die out next month, was not really viable.)

Eventually though, Netflix moved online, and the cost of a bad recommendation was much less: just stop watching and switch to a new movie. Moreover, it was perfectly fine if everyone watched the same blockbuster. In fact, it was better, because they could cache it at your ISP and caches always work better if people are boring and average.

Worse, as the story goes, Netflix noticed a pattern: the more hours people watch, the less likely they are to cancel. (This makes sense: the more hours you spend on Netflix, the more you feel like you “need” it.) And with new people trying the service at a fixed or proportional rate, higher retention translates directly to faster growth.

When I heard this was also when I learned the word “satisficing,” which essentially means searching through sludge not for the best option, but for a good enough option. Nowadays Netflix isn’t about finding the best movie, it’s about satisficing. If it has the choice between an award-winning movie that you 80% might like or 20% might hate, and a mainstream movie that’s 0% special but you 99% won’t hate, it will recommend the second one every time. Outliers are bad for business.
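The difference between the two objectives fits in a few lines. A sketch with made-up titles and probabilities, contrasting a quality-maximizer with a satisficer:

```python
def recommend(movies, mode="satisfice"):
    """Each movie is (title, p_love, p_hate). A satisficer minimizes the
    chance of a session-ending bad experience; a quality-maximizer picks
    the best expected reaction."""
    if mode == "satisfice":
        return min(movies, key=lambda m: m[2])[0]      # least likely to be hated
    return max(movies, key=lambda m: m[1] - m[2])[0]   # best expected payoff

movies = [
    ("award-winning drama", 0.80, 0.20),
    ("mainstream action sequel", 0.30, 0.01),
]
print(recommend(movies))               # → mainstream action sequel
print(recommend(movies, mode="best"))  # → award-winning drama
```

Same catalog, same numbers: only the objective function changed, and the inoffensive blockbuster wins.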

The thing is, you don’t need a risky, privacy-invading profile to recommend a mainstream movie. Mainstream movies are specially designed to be inoffensive to just about everyone. My Netflix recommendations screen is no longer “Recommended for you,” it’s “New Releases,” and then “Trending Now,” and “Watch it again.”

As promised, Netflix paid out their $1 million prize to buy the winning recommendation algorithm, which was even better than their old one. But they didn’t use it, they threw it away.

Some very expensive A/B testers determined that this is what makes me watch the most hours of mindless TV. Their revenues keep going up. And they don’t even need to invade my privacy to do it.

Who am I to say they’re wrong?

Source: https://apenwarr.ca/log/20190201

45 Techniques Used by Data Scientists

These techniques cover most of what data scientists and related practitioners are using in their daily activities, whether they use solutions offered by a vendor, or whether they design proprietary tools. When you click on any of the 45 links below, you will find a selection of articles related to the entry in question. Most of these articles are hard to find with a Google search, so in some ways this gives you access to the hidden literature on data science, machine learning, and statistical science. Many of these articles are fundamental to understanding the technique in question, and come with further references and source code.

Starred techniques (marked with a *) belong to what I call deep data science, a branch of data science that has little if any overlap with closely related fields such as machine learning, computer science, operations research, mathematics, or statistics. Even classical machine learning and statistical techniques such as clustering, density estimation, or tests of hypotheses have model-free, data-driven, robust versions designed for automated processing (as in machine-to-machine communications), and thus also belong to deep data science. However, these techniques are not starred here, as the standard versions of these techniques are better known (and unfortunately more used) than the deep data science equivalents.

To learn more about deep data science, click here. Note that unlike deep learning, deep data science is not the intersection of data science and artificial intelligence; however, the analogy between deep data science and deep learning is not completely meaningless, in the sense that both deal with automation.

Also, to discover in which contexts and applications the 45 techniques below are used, I invite you to read the following articles:

Finally, when using a technique, you need to test its performance. Read this article about 11 Important Model Evaluation Techniques Everyone Should Know.

The 45 data science techniques

  1. Linear Regression
  2. Logistic Regression
  3. Jackknife Regression *
  4. Density Estimation
  5. Confidence Interval
  6. Test of Hypotheses
  7. Pattern Recognition
  8. Clustering – (aka Unsupervised Learning)
  9. Supervised Learning
  10. Time Series
  11. Decision Trees
  12. Random Numbers
  13. Monte-Carlo Simulation
  14. Bayesian Statistics
  15. Naive Bayes
  16. Principal Component Analysis – (PCA)
  17. Ensembles
  18. Neural Networks
  19. Support Vector Machine – (SVM)
  20. Nearest Neighbors – (k-NN)
  21. Feature Selection – (aka Variable Reduction)
  22. Indexation / Cataloguing *
  23. (Geo-) Spatial Modeling
  24. Recommendation Engine *
  25. Search Engine *
  26. Attribution Modeling *
  27. Collaborative Filtering *
  28. Rule System
  29. Linkage Analysis
  30. Association Rules
  31. Scoring Engine
  32. Segmentation
  33. Predictive Modeling
  34. Graphs
  35. Deep Learning
  36. Game Theory
  37. Imputation
  38. Survival Analysis
  39. Arbitrage
  40. Lift Modeling
  41. Yield Optimization
  42. Cross-Validation
  43. Model Fitting
  44. Relevancy Algorithm *
  45. Experimental Design

Source: https://www.datasciencecentral.com/profiles/blogs/40-techniques-used-by-data-scientists

Important cybersecurity terms even your non-tech employees need to know

Cyberattacks continue to grow in scale, ferocity, and audacity. No one is safe. Large corporations are a target because hackers see the potential payoff as huge. Small companies are vulnerable too because they don’t have the financial muscle needed to invest in sophisticated security systems. Now more than ever, businesses must do whatever it takes to keep their data and tech infrastructure safe. If non-techie employees understand key cybersecurity terms, they’ll have a much better chance of making the right security decisions. There are thousands of cybersecurity terms but no one (techie or otherwise) is under obligation to know all of them. Some terms are, however, more important than others and these are the ones all staff must be aware of.

Note that knowing these cybersecurity terms is more than just mastering the definitions. Rather, it’s being able to understand the patterns and behavior that define them.


1. Adware

Adware is a set of programs installed without explicit user authorization that seek to inundate the user with ads. The primary aim of adware is to redirect search requests and URL clicks to advertising websites and data collection portals.

While adware mainly aims to advertise a product and monitor user browsing activity, it also slows down browsing speed, page-load speed, device performance, eats into metered data, and may even download malicious applications in the background.

2. Botnet


Botnets are collections of Internet-enabled devices (they can number in the millions) such as computers, smartphones, servers, routers, and IoT devices that are under central command and control.

Botnet is a portmanteau of “robot” and “network.” Botnet infections are contagious and can propagate across many devices. Some of the largest and most dramatic cyberattacks in recent times have involved botnets, including the destructive Mirai malware that infected IoT devices.

3. Cyber-espionage

When you hear the term espionage, what first comes to mind is the world in a bygone era. But espionage is as alive today as it was a century ago. The difference is that thanks to the proliferation of information technology and the ubiquity of the Internet, espionage can now be executed electronically and remotely.

Cyber-espionage is the gathering of confidential information online via illegal and unauthorized means. As you’d expect, the primary target of cyber-espionage is governments as well as large corporations. China has been in the news in this regard though other world powers such as the United States and Russia have been accused of doing the same at some point.


4. Defense-in-depth

Defense-in-depth is a cybersecurity strategy that involves creating multiple layers of protection in order to protect the organization and its assets from attack. It’s born out of a realization that even with the best and most sophisticated technical controls, no security is ever 100 percent impenetrable.

With defense-in-depth, if one security control fails to prevent unauthorized access, the intruder will run into a new barrier. It’s unlikely that many hackers will have the knowledge and skills to surmount these multiple barriers.

5. End-to-end encryption

End-to-end encryption is a means of securing data so that unauthorized third parties cannot read it, whether at rest or in transit. For instance, when you shop online and pay with your credit card, your computer or smartphone has to relay the card number you provide to the merchant for authentication and payment processing.

If your card details fall into the wrong hands, someone could use them to make purchases without your permission. By encrypting the data in transit, you make it far harder for third parties to access your confidential information.
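The idea can be illustrated with a deliberately minimal sketch: a one-time-pad-style XOR cipher over a shared random key. This is a toy for illustration only (real systems use vetted protocols such as TLS or the Signal protocol, and a one-time pad's key must be as long as the message and never reused); the card number and key here are made up.

```python
import secrets

def xor_bytes(data, key):
    # XOR each byte with the corresponding key byte; applying it twice
    # restores the original, so the same function encrypts and decrypts.
    return bytes(d ^ k for d, k in zip(data, key))

card = b"4111111111111111"
key = secrets.token_bytes(len(card))  # secret shared only by the two endpoints

ciphertext = xor_bytes(card, key)
print(ciphertext != card)           # True: unreadable to anyone in the middle
print(xor_bytes(ciphertext, key))   # b'4111111111111111'
```

Only a party holding the key can turn the ciphertext back into the card number, which is the essential property end-to-end encryption provides.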

6. Firewalls

A firewall is a defense mechanism that is meant to keep the bad guys from penetrating your network. It’s a virtual wall that protects servers and workstations from internal and external attack. It keeps tabs on access requests, user activity, and network traffic patterns in order to determine who can and cannot be allowed to interact with the network.
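Conceptually, a firewall works through an ordered rule list, with the first matching rule deciding a connection's fate. A minimal sketch, with a hypothetical first-match-wins rule set invented for illustration:

```python
import ipaddress

# Toy rule list: (action, source network, destination port or None for any).
RULES = [
    ("deny",  "203.0.113.0/24", None),  # block a known-bad address range
    ("allow", "0.0.0.0/0",      443),   # allow HTTPS from anywhere
    ("deny",  "0.0.0.0/0",      None),  # default deny everything else
]

def check(src_ip, dst_port):
    """Return the action of the first rule matching this connection."""
    for action, network, port in RULES:
        ip_match = ipaddress.ip_address(src_ip) in ipaddress.ip_network(network)
        port_match = port is None or port == dst_port
        if ip_match and port_match:
            return action
    return "deny"

print(check("198.51.100.7", 443))  # allow
print(check("203.0.113.9", 443))   # deny
```

Real firewalls add stateful connection tracking and deep packet inspection on top of this basic rule-matching idea.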

7. Hashing

Hashing is a one-way function that transforms an input of any length, such as a password, into a fixed-length string of characters. Unlike encryption, it is not designed to be reversed. That way, if an intruder somehow gets hold of the password file or table, all they will see is digests that are useless to them without the original passwords.
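Python's standard library shows the pattern directly. This sketch uses salted PBKDF2 (a standard key-derivation function that repeats the hash to slow brute-force guessing); the example password is made up:

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    # A random salt makes identical passwords hash differently;
    # 100,000 PBKDF2 iterations deliberately slow down guessing attacks.
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify(password, salt, stored_digest):
    # Constant-time comparison avoids leaking information via timing.
    return hmac.compare_digest(hash_password(password, salt)[1], stored_digest)

salt, stored = hash_password("correct horse")
print(verify("correct horse", salt, stored))  # True
print(verify("wrong guess", salt, stored))    # False
```

Only the salt and digest are stored; the password itself never needs to be kept anywhere.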

8. Identity theft

Identity theft is sometimes referred to as identity fraud. It’s the No. 1 reason hackers seek access to confidential information and customer data, especially from organizations. An identity thief hopes to impersonate an individual by presenting the individual’s confidential records or authentication information as their own.

For example, an identity thief could steal credit card numbers, addresses, and email addresses then use that to fraudulently transact online, file for Social Security benefits, or submit an insurance claim.

9. Intrusion detection system (IDS)

It’s relatively uncommon for a cyberattack to be completely unprecedented or unknown in its form, pattern, and logic. From viruses to brute force attack, there are certain indicators that point to unusual activity. In addition, once your network is up and running, all network traffic and server activity will follow a relatively predictable pattern.

An IDS keeps tabs on network traffic, detecting malicious, suspicious, or anomalous activity before too much damage is done and alerting the network administrator. Systems that go a step further and automatically block the offending traffic are usually called intrusion prevention systems (IPS).
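The anomaly-detection side of an IDS can be sketched in a few lines: establish what normal traffic looks like, then flag sources that deviate sharply. The threshold and event data below are hypothetical; real systems combine signature matching with far richer statistical models.

```python
from collections import Counter

def detect_anomalies(events, threshold=100):
    # Flag any source IP whose request count exceeds the baseline threshold.
    counts = Counter(src for src, _ in events)
    return {src for src, n in counts.items() if n > threshold}

# 500 rapid requests from one host alongside normal background traffic.
events = [("10.0.0.5", "GET /")] * 500 + [("10.0.0.9", "GET /")] * 3
print(detect_anomalies(events))  # {'10.0.0.5'}
```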

10. IP spoofing

IP address forgery, or spoofing, is an address-hijacking technique in which a third party poses as a trusted IP address in order to mimic a legitimate user’s identity, hijack an Internet browser, or otherwise gain access to a restricted network. Spoofing an IP address is not in itself illegal; some people conceal their online activity to maintain anonymity (using tools such as Tor).

But IP spoofing is more often associated with illegal or malicious activity. So organizations should exercise caution and take appropriate precautions whenever they detect that a third party wants to connect to their network using a spoofed address.

11. Keylogger

Keylogger is short for keystroke logger: a program that records the keystrokes made on your keyboard. The keylogger saves the log to a file, then encrypts and transmits it. While keylogging can serve legitimate purposes (some accessibility and text-input apps, for example, use keylogging mechanisms to capture and translate user activity), keyloggers are often a form of malware.

A keylogger in the hands of nefarious persons is a destructive tool and is perhaps the most powerful weapon of infiltration a hacker can have. Remember, the keylogger will capture all key information such as user names, passwords, PINs, pattern locks, and financial information. With this data, the hacker can easily access your systems without breaking a sweat.

12. Malware

Malware is one of the cybersecurity terms you will hear the most often. It’s a catch-all word that describes all malicious programs including viruses, Trojans, spyware, adware, ransomware, and keyloggers. It’s any program that takes over some or all of the computing functions of a target computer for ill intent. Some malware is little more than a nuisance, but in many cases malware is part of a wider hacking and data extraction scheme.

13. Password sniffing


Password sniffing is the interception and inspection of network traffic in search of data packets that contain passwords. Given the volume of network traffic relayed per second, password sniffing is most effectively done by an application referred to as a password sniffer. The sniffer captures and stores the password strings for malicious and illegal purposes.

14. Pharming

Pharming is the malicious redirection of a user to a fraudulent site that has colors, design, and features that look very similar to the original legitimate website. A user will unsuspectingly key in their data into the fake website’s input forms only to realize days, weeks, or months later that the site they gave their information to was harvesting their data to commit fraud.

15. Phishing

Phishing is a form of social engineering and the most common type of cyberattack. Every day, billions of phishing emails are sent out globally. Phishing emails purport to come from a credible, recognizable sender such as eBay, Amazon, or a financial institution. The email tricks the recipient into entering their username and password on what they believe is a legitimate website but is in reality one maintained by cyberattackers.
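One tell-tale phishing trick is a lookalike hostname that merely contains a trusted brand, such as `amazon.com.verify-account.example`. A minimal heuristic sketch (the trusted-domain list and URLs are invented for illustration; real phishing detection uses many more signals):

```python
from urllib.parse import urlparse

TRUSTED = {"ebay.com", "amazon.com"}  # hypothetical allow-list

def looks_suspicious(url):
    host = urlparse(url).hostname or ""
    # Flag hosts that contain a trusted brand name without actually
    # being that domain or one of its subdomains.
    for brand in TRUSTED:
        if brand in host and not (host == brand or host.endswith("." + brand)):
            return True
    return False

print(looks_suspicious("https://www.amazon.com/deals"))             # False
print(looks_suspicious("http://amazon.com.verify-account.example"))  # True
```

Training employees to read the full hostname right-to-left catches exactly the class of trick this heuristic encodes.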

Knowing these cybersecurity terms is a first step in preventing cyberattacks

While technical controls are crucial, employees are the weakest link in your security architecture. Nothing makes employees better prepared for a cyberattack than security training and awareness. For most organizations, the IT department represents only a fraction of the entire workforce.

Tech staff cannot be everywhere to explain cybersecurity terms and help each employee make security-conscious decisions, so making sure your non-techie staff are familiar with these terms is fundamental.

http://techgenix.com/15-cybersecurity-terms/

Beapy uses the NSA’s DoublePulsar and EternalBlue, plus Mimikatz, to collect and use passwords and mine cryptocurrency in the wake of Coinhive’s shutdown

Two years after highly classified exploits built by the National Security Agency were stolen and published, hackers are still using the tools for nefarious reasons.

Security researchers at Symantec say they’ve seen a recent spike in a new malware, dubbed Beapy, which uses the leaked hacking tools to spread like wildfire across corporate networks to enslave computers into running mining code to generate cryptocurrency.

Beapy was first spotted in January but rocketed to more than 12,000 unique infections across 732 organizations since March, said Alan Neville, Symantec’s lead researcher on Beapy, in an email to TechCrunch. The malware almost exclusively targets enterprises, which host large numbers of computers; when infected with cryptocurrency-mining malware, these machines can generate sizable sums of money.

The malware relies on someone in the company opening a malicious email. Once opened, the malware drops the NSA-developed DoublePulsar malware to create a persistent backdoor on the infected computer, and uses the NSA’s EternalBlue exploit to spread laterally throughout the network. These are the same exploits that helped spread the WannaCry ransomware in 2017. Once the computers on the network are backdoored, the Beapy malware is pulled from the hacker’s command and control server to infect each computer with the mining software.

Not only does Beapy use the NSA’s exploits to spread, it also uses Mimikatz, an open-source credential stealer, to collect and use passwords from infected computers to navigate its way across the network.

According to the researchers, more than 80 percent of Beapy’s infections are in China.

Hijacking computers to mine for cryptocurrency — known as cryptojacking — has been on the decline in recent months, partially following the shutdown of Coinhive, a popular mining tool. Hackers are finding the rewards fluctuate greatly depending on the value of the cryptocurrency. But cryptojacking remains a more stable source of revenue than the hit-and-miss results of ransomware.

In September, some 919,000 computers were vulnerable to EternalBlue attacks — many of which were exploited for mining cryptocurrency. Today, that figure has risen to more than a million.

Typically, cryptojackers exploit vulnerabilities in websites which, when opened in a user’s browser, use the computer’s processing power to generate cryptocurrency. But file-based cryptojacking is far more efficient and faster, allowing the hackers to make more money.

In a single month, file-based mining can generate up to $750,000, Symantec researchers estimate, compared to just $30,000 from a browser-based mining operation.

Cryptojacking might seem like a victimless crime (no data is stolen and files aren’t encrypted), but Symantec says the mining campaigns can slow down computers and cause device degradation.

A new cryptocurrency mining malware uses leaked NSA exploits to spread across enterprise networks

Sensorvault, Google’s location database: turning cellphone users’ locations into a digital dragnet for law enforcement

The warrants, which draw on an enormous Google database employees call Sensorvault, turn the business of tracking cellphone users’ locations into a digital dragnet for law enforcement. In an era of ubiquitous data gathering by tech companies, it is just the latest example of how personal information — where you go, who your friends are, what you read, eat and watch, and when you do it — is being used for purposes many people never expected. As privacy concerns have mounted among consumers, policymakers and regulators, tech companies have come under intensifying scrutiny over their data collection practices.

The Arizona case demonstrates the promise and perils of the new investigative technique, whose use has risen sharply in the past six months, according to Google employees familiar with the requests. It can help solve crimes. But it can also snare innocent people.

https://www.seattletimes.com/nation-world/tracking-phones-google-is-a-dragnet-for-the-police/