Information Gathering Methods
Information gathering is not as challenging as it was a few years ago, when details about a target could only be obtained directly from the target or by asking around. The internet, and social media in particular, has simplified this stage with newer and faster techniques of data collection. In the process of data gathering, no piece of data should be dismissed as irrelevant. A small bit of information, such as a target’s favorite joint, might be enough to help the social engineer convince the target to act in a certain way. Because so much information is available, however, it is easy to drown in material that does not help, so a social engineer needs to know what type of information to look for and where it can be found. Having information is not enough; it is also important to know how to use it to profile a target and make them more predictable.
Information-gathering methods fall into two broad categories: technical and nontechnical. As the name suggests, technical methods rely on computer-aided techniques of collecting information. There is, however, no assurance that a particular tool or piece of electronic equipment will obtain sufficient information about a target. Social engineers therefore use a mix of tools and techniques and merge the information they get to build a profile of their targets.
Technical information-gathering methods
Many tools are being developed today specifically for information gathering during social engineering attacks. Arguably the most successful is a Linux distribution called Kali. It contains a suite of over 300 tools, many of which can be used to gather information about a target. From these, let’s narrow down to two popular tools that stand out in that they do not collect data themselves but help with storing and retrieving it. These are as follows:
Kali Linux can be downloaded from the www.kali.org website and used to run the following tools:
BasKet is a free and open-source Linux program that works as an advanced note-taking and data storage tool to aid a social engineer in the data-gathering process. It has the familiar appearance of Notepad but comes with many more features. It serves as a repository for the textual and graphic information that a social engineer collects on a particular target. It may appear simple or even unnecessary during a social engineering attack, but it serves a purpose that is hard to replicate in word processors such as Microsoft Word. BasKet uses a tab-like layout that lets the social engineer place each type of information about a target in an orderly way that is easy to read or retrieve.
For example, pictures could be in one tab, contact information in another, social media information in a third, and physical location details in a fourth. A social engineer keeps updating these tabs whenever they come across more information. At the end of the process, BasKet allows the social engineer to export this information as an HTML page, which packs all the information together and makes it more portable, accessible, and shareable.
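The idea of keeping one category per tab and exporting everything as a single HTML page can be sketched in a few lines of Python. This is only an illustration of the concept, not BasKet's own file format or export code; the function name `export_notes_html`, the category names, and the sample entries are all assumptions made for the example.

```python
from html import escape


def export_notes_html(target: str, notes: dict[str, list[str]]) -> str:
    """Render categorized notes as one self-contained HTML page."""
    parts = [
        f"<html><head><title>{escape(target)}</title></head><body>",
        f"<h1>{escape(target)}</h1>",
    ]
    # One heading per category, mirroring the one-tab-per-category layout.
    for category, entries in notes.items():
        parts.append(f"<h2>{escape(category)}</h2>")
        parts.append("<ul>")
        parts.extend(f"<li>{escape(entry)}</li>" for entry in entries)
        parts.append("</ul>")
    parts.append("</body></html>")
    return "\n".join(parts)


# Hypothetical categories and entries, kept in a plain dictionary.
notes = {
    "Contact information": ["jdoe@example.com"],
    "Social media": ["LinkedIn profile lists employer"],
    "Physical location": [],
}
# A category is updated whenever more information turns up.
notes["Physical location"].append("Works from the downtown office")

page = export_notes_html("John Doe", notes)
```

The resulting string can be written to a single `.html` file, which is what makes the collected notes easy to carry around and share.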
Dradis (https://dradisframework.com/ce/) is a free and open-source application for Linux, Windows, and macOS that is used to store information. It has a more advanced look than BasKet’s notepad-like appearance. Dradis is also more advanced in its functionality in that it acts as a centralized repository and uses a web-based UI for users to interact with it.
Instead of tabs (as in BasKet), Dradis uses branches that allow a user to group different types of information together. Dradis can handle huge amounts of data that would be problematic for BasKet. It is, therefore, commonly used when there is a lot of information to be sorted by target.
Having covered the two major data storage tools, it is now time to look at the ways in which social engineers gather information. These are discussed as follows:
Corporate and personal websites are among the richest sources of information about targets. Corporate websites may contain information about their staff and clients. Personal websites, on the other hand, contain information purely about individuals. With enough digging around, websites can reveal a lot of information. Personal websites may reveal an individual’s work engagements, physical location, contact information, and some special words that may be used in profiling passwords.
Concerning the last point, it is known that, for the sake of familiarity, people tend to build their passwords from phrases or words that are familiar to them, such as a date of birth, a partner’s name, a pet’s name, or their own names. Corporate websites often provide biographies of their staff, especially the high-ranking ones, along with their work contact information. If you wanted to target the organization with a malicious email attachment, sending it to email addresses listed on a corporate website has a higher chance of delivering the payload directly inside the organization.
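The point about familiar words in passwords can be turned into a simple check that flags a password built from details a person has published about themselves. The following is a minimal sketch under stated assumptions: the function names, the three-character minimum, and the sample facts are hypothetical and are not taken from any particular password policy tool.

```python
import re


def personal_tokens(facts: list[str]) -> set[str]:
    """Split published facts into lowercase word and number tokens."""
    tokens = set()
    for fact in facts:
        for piece in re.findall(r"[a-z]+|\d+", fact.lower()):
            # Assumption: very short pieces cause too many false matches.
            if len(piece) >= 3:
                tokens.add(piece)
    return tokens


def find_personal_tokens(password: str, facts: list[str]) -> list[str]:
    """Return the personal tokens that appear inside the password."""
    lowered = password.lower()
    return sorted(t for t in personal_tokens(facts) if t in lowered)


# Hypothetical facts of the kind a personal website might reveal.
facts = ["John Doe", "Rex", "1985-04-12"]

weak = find_personal_tokens("Rex1985!", facts)
strong = find_personal_tokens("c0rrect-h0rse", facts)
```

Here `weak` lists the pet's name and the birth year, while `strong` is empty. A check like this only catches direct reuse; it does not detect substitutions such as `R3x`.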
It is said that the internet never forgets. If you want to know something, knowing the right way to ask might get you almost all the information you want. Google, the dominant search engine, is a key tool that social engineers use to unearth information about targets on the internet. We will go over some of the search phrases that social engineers use to hunt for information about targets using Google:
- To search for a target’s information within a specific domain such as a corporate site, the following query could be used:
site:www.websitename.com "John Doe"
If anything about John Doe is contained in the website and has been indexed, Google will return it in the search results of the query.
- To search for a target’s information in the title of any website indexed by Google, the following query is used:
intitle:John Doe
It is important to understand that the space between the two words limits the operator to the first one: Google searches for pages that have John in the title and Doe anywhere in the page. This is a very useful query since it captures information about a target from the titles of multiple websites. It yields results ranging from corporate sites to social media platforms, because they often use a person’s name as the title of some pages.
- To search for a target’s information in the URL of any website, the following query can be supplied to Google:
inurl:john doe
It is a common practice in many organizations to reuse the words of a page title in its URL for SEO purposes. This query identifies a person’s name in URLs indexed by Google. Note that, in a similar manner to the previous query, it searches for john in URLs and doe anywhere in the page. If a social engineer wants to search for both of the target’s names in the URL, rather than one in the URL and the other in the text, the following query can be used:
allinurl:john doe
The query will restrict results to those where the URL includes both John and Doe.
- More often than not, a target will have applied for jobs using job boards. Some job boards retain the target’s curriculum vitae on their websites, and some organizations retain the curricula vitae of their job applicants on their sites. A curriculum vitae contains highly sensitive details about a person: their real name, real phone number, real email address, educational background, and work history. It holds a wealth of information that is very useful for a social engineering attack. To search for a target’s private details, the social engineer can use the following query:
"John Doe" intitle:"curriculum vitae" "phone" "address" "email"
It is a very powerful query that scours the pages indexed by Google for results that mention John Doe, have curriculum vitae in the title, and contain the words phone, address, and email.
- The following query is used to gather information, not about a particular person, but rather an organization. It targets confidential releases of information within the organization that may be posted on websites:
intitle:"not for distribution" "confidential" site:websitename.com
The query searches the given website for pages that have not for distribution in the title and contain the word confidential. This search may unearth information that some employees of the organization might not even be aware of. It is a very useful query in a social engineering attack when a social engineer wants to appear informed about the internal matters of an organization to a certain target.
- One of the commonly used pretexts to enter guarded premises is that of an IT or networking repair person contacted urgently by the company. Guards will readily let in such a person, who can then carry out an attack in the midst of other employees without raising alarms. To carry off such a pretext, a social engineer needs to be knowledgeable about the internal network or infrastructure of the organization. The following is a group of search queries that might give this information to a social engineer:
intitle:"Network Vulnerability Assessment Report"
intitle:"Host Vulnerability summary report"
This information can also be used in certain parts of the attack since it also reveals weaknesses that can be exploited in the target’s network or in the hosts connected to the network.
- To search for passwords used by users in an organizational network, a backup of these passwords could be a useful place to begin to search. As such, the following query might come in handy:
site:websitename.com filetype:sql ("password values" || "passwd" || "old passwords" || "passwords" || "user password")
This query looks for SQL files in a website’s domain that contain the phrases password values, passwd, old passwords, passwords, or user password. Even though these files may not hold the users’ current passwords, they may give an attacker enough information to profile the current ones. For example, there is a high chance that an employee’s new email password is a variation of the old one.
There are many other data-hunting queries that can be used in Google and other search engines. The ones discussed are just the most commonly used. A word of caution: the internet never forgets, and even when some information is deleted from a website, other sites may still store cached copies of it. Therefore, it is best for organizations not to publicly post their sensitive information.
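Since these queries are just operators combined with quoted phrases, an organization that wants to check what Google has indexed about its own domain can compose them programmatically. The following is a minimal sketch that only builds the query strings; the helper names are hypothetical, and it does not send anything to Google.

```python
def quote(phrase: str) -> str:
    """Wrap a phrase in double quotes so it is matched exactly."""
    return f'"{phrase}"'


def site_search(domain: str, phrase: str) -> str:
    """Restrict an exact-phrase search to a single domain."""
    return f"site:{domain} {quote(phrase)}"


def title_search(title_phrase: str, *body_terms: str) -> str:
    """Require a phrase in the page title plus terms anywhere in the page."""
    parts = [f"intitle:{quote(title_phrase)}"]
    parts.extend(quote(term) for term in body_terms)
    return " ".join(parts)


def confidential_search(domain: str) -> str:
    """Look for pages on one domain that are marked as internal."""
    return f"{title_search('not for distribution', 'confidential')} site:{domain}"


# Rebuilding three of the queries discussed in this section.
q_site = site_search("www.websitename.com", "John Doe")
q_cv = f"{quote('John Doe')} {title_search('curriculum vitae', 'phone', 'address', 'email')}"
q_conf = confidential_search("websitename.com")
```

Keeping each operator in its own small function makes it easy to run the same set of checks against every domain an organization owns.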
Another commonly used tool to gather information about a target is Pipl (https://pipl.com). Pipl archives information about people and offers it for free to whoever wishes to access it. It stores information such as a person’s real names, email address, phone number, physical address, and social media accounts. Alongside this, it offers a paid option to collect information about a person’s relatives, starting with their siblings and parents. It is a goldmine for social engineers since, with very little effort, they are able to access a ton of information about their targets. Let us take a real-life example instead of the commonly used John Doe, which may have many results. Let us use an uncommon name such as Erdal Ozkaya:
The site returns a number of results for the name we have searched. Let us explore the first result. It is of an Erdal Ozkaya, a 40-year-old male from Sydney, Australia. The site offers us sponsored links to find vital records, contact details, and username reports. Let us go ahead and click on the name and see what the site has about Erdal Ozkaya that is available free of charge:
The site is able to pull out more information about this name. We now know that he (in this case, me) works as a Cybersecurity Architect at Microsoft, that he has a PhD in cybersecurity and a master’s degree in security from Charles Sturt University, and that he is associated with or related to some people, whom I have blurred out for privacy reasons, who might be his parents and siblings. Starting from a total stranger, we now know a lot about him and have information that we can use to gather more about him.
From here, it is easy to hunt for more information using the special Google queries we discussed earlier. You can go ahead to find his CV, which will contain more contact information.
From our example, we have explored some of the capabilities of Pipl when it comes to hunting for information about targets. With such sites available to anyone, it is clear that privacy is nothing more than an illusion. Sites such as these get their information from social media platforms, corporate websites, data sold by third parties, data released by hackers, data stolen from other websites, and even data held by government agencies.
This particular site is able to get criminal records about a target, which means it has access to some felony records. What is worrying is that these sites are not illegal and will continue adding data about people for a long time to come. It is good news to a social engineer, but bad news to anyone else that may be a target. The site owners cannot be compelled to remove the data they contain and therefore once your data gets to them, there is no way you can hide. The site can only get stronger with more information.
Still on the sites that archive information, Whois.net is yet another one that serves almost the same purpose as Pipl. Whois.net lists information such as the email addresses, telephone numbers, and IP addresses associated with the targets one searches for. Whois.net also has access to information about domains. If a target has a personal website, Whois.net is able to find fine details about the registrant and registrar of the domain name, its registration and expiry dates, and the contact information of the site owner. Just like Pipl, the information obtained here could be used to obtain more information about a target and thereby launch a successful attack.
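To see how much a domain record gives away, it helps to break a WHOIS response into its fields. The following is a minimal sketch that parses the common `Key: Value` layout. The sample record is made up for illustration, real responses vary between registries, and the function `parse_whois` is a hypothetical helper rather than part of Whois.net or any library.

```python
def parse_whois(raw: str) -> dict[str, list[str]]:
    """Collect 'Key: Value' lines from WHOIS text into a dictionary."""
    record: dict[str, list[str]] = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and the comment or notice lines some servers add.
        if not line or line.startswith(("%", "#", ">>>")):
            continue
        key, sep, value = line.partition(":")
        if not sep or not value.strip():
            continue
        # A key such as "Name Server" can repeat, so values are kept in a list.
        record.setdefault(key.strip().lower(), []).append(value.strip())
    return record


# Made-up record in the style of a typical WHOIS response.
SAMPLE = """
% This is an illustrative record, not real data
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Creation Date: 1995-08-14T04:00:00Z
Registry Expiry Date: 2030-08-13T04:00:00Z
Name Server: A.EXAMPLE-DNS.NET
Name Server: B.EXAMPLE-DNS.NET
"""

record = parse_whois(SAMPLE)
```

Running the same parser over your own domain's record is a quick way to see which registrant and contact details are publicly exposed.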
Billions of people have embraced social media so far. Using these platforms, social engineers can find a ton of information about most of their targets. Most targets will have Facebook, Twitter, Instagram, or LinkedIn accounts. The beauty of social media is that it encourages users to share personal details of their lives on the internet. Social media users are conveniently careless and end up giving out even sensitive pieces of information to the whole world without thinking about the consequences. It is clear from these statements that social media is doing nothing more than compounding the problem. It is creating a rich pool of information from where social engineers can fish out details about targets without arousing suspicion.
The following is a screenshot from Facebook, which gives lots of information publicly:
Within a couple of minutes of searching on multiple social media platforms, a social engineer is able to gather the target’s hobbies, place of work, likes and dislikes, relatives, and more private information. Social media users are ready to brag about the holidays they are off to, the places they work, the jobs they do there, their new cars, and the schools to which they take their kids. They are not scared of showing their work badges on these social media sites, badges that social engineers could duplicate and use to get into organizations.
Social media users will also befriend or follow strangers provided that they share hobbies and interests. It is a crazy world out there that puts potential targets at a disadvantage, since these sites are designed to make people open up to strangers on the internet. Information that was traditionally reserved for face-to-face conversations is now being put out for the world to see. The bad thing is that both well-motivated and ill-motivated people are accessing it.
A social engineer could use this information to profile a target, and it may come in handy when convincing a target to take some action or divulge some information. Let us take a hypothetical example in which we are social engineers and want to get top secret designs and specifications from a US military contractor so that we may learn how to compromise their equipment.
We can start by going into a social media platform such as LinkedIn and searching for the name of that company. If the company is on LinkedIn, we will be shown the company profile and a list of people that have listed on LinkedIn that they work there. Next, we identify an employee that works in the research and design department or even the marketing department.
We then concentrate on getting information about this target that might help us in putting them in a position in which they can divulge the secret information that we are looking for. We start by searching the employee’s Facebook profile to find hobbies, interests, and other personal details. We proceed to Instagram and take a look at the type of pictures that the employee posts.
We start geolocating the target, associating the information we get on all social media accounts in his name. We get to a point that we find out his physical address and the places he likes to spend time at. We approach him at that point and use one of the tactics learned earlier in the mind tricks and persuasion chapters to get him to insert a malware-loaded USB drive into his computer. From there, the malware will start harvesting for us the information that we want. It is as easy as that.
The following is some public information about me on LinkedIn:
Organizations are being targeted in a similar fashion. In early 2017, 10,000 US employees were spear phished using social media by Russian hackers that planted malware on social media posts and messages. In mid-2017, a fake persona of a girl by the name Mia Ash was created and used to attack a networking firm by targeting a male employee with extensive rights in the organization. The attack was foiled only because the organization had strong controls to protect itself from malware. The male employee had already fallen for the con set out using the girl’s fake Facebook account.
In August 2016, it was discovered that a massive-scale financial fraud had targeted customers who followed a certain bank on social media. It is believed that attackers were able to take control of the bank’s social media accounts and send fraudulent offers to the followers, who only ended up losing money. There are many other social engineering attacks mediated by social media that have happened. What they have in common is the ready availability of private information on social media.
S. Wolfe, The Top 10 Worst Social Media Cyber-Attacks, Infosecurity Magazine, 2017. Available at https://www.infosecurity-magazine.com/blogs/top-10-worst-social-media-cyber/. [Accessed on December 13, 2017].
This article has been taken from my award-winning book, Learn Social Engineering, which covers social engineering in more depth.
You can get it via My Books: https://www.erdalozkaya.com/about-erdal-ozkaya/my-books/