Reverse Engineering Software

Summary

Reverse Engineering a Software Application is the process of having a Team analyze an existing Software Application and break it down to understand what it does and how it works. There are several possible goals for doing this, alone or in combination:

Redesigning the User Interface using more modern techniques
Redesigning the “Back End” - Business Processes and Data Tiers
Redesigning Integrations with other systems
Reducing Security Risks
Documenting the existing Legacy Application
Or, doing a Competitive Analysis on a Competitor's Product

Why do we need to Reverse Engineer Software?

Obsolete Technology & Business Risk of Failure

This is probably the number 1 reason to Reverse Engineer an existing Legacy application.

Often you may have a “Mission Critical” system that has “worked” for the past 10 or even 20 years. However, over that time technology, web user interfaces, hardware, databases, integration methods, etc. have all evolved. Furthermore, the underlying technologies used (hardware, database, UI code, etc.) may no longer be supported by the OEM Vendor, or anyone for that matter.

Considerations to take into account:

Technology used in the Application is Obsolete / no longer Supported:

User Interface (UI) coding techniques. Some examples:
If your application uses Adobe Flash, then this portion of the UI will not run in a modern, updated Web Browser. Adobe ended support for Flash Player at the end of 2020, and the major Browsers removed it because of its many known security flaws.
If your application was built to run on Microsoft Internet Explorer 6, 7, or 8, then it will likely not work correctly in any modern, updated Web Browser. During that period Microsoft used its own JavaScript implementation (JScript) along with proprietary extensions, which modern Browsers simply do not support.

Use of Older Programming Languages for Business Processes. An example:
COBOL is a 60 year old programming language, and yet
90% of the Fortune 500 companies have systems that use it
And it is estimated that over 200 Billion lines of COBOL code are still in use
While it is “Keeping the Lights On”, try finding a COBOL Developer today to maintain your application. Many of them are in their 60s and 70s, if not already retired.
So, what do you do if the Application has a problem?

Legacy Applications running on Legacy Hardware.
20 or 30 years ago, we did not have the same level of redundancy built into the Application or Hardware Infrastructure.
Virtualization was in its infancy, the “Cloud” had yet to be created for mass use, creating Hot Fail-Over systems was extremely expensive, etc.
In addition, every hardware component has a known Mean Time Between Failures (MTBF), so we know that at some point it will stop working.
The BIG question is when it Fails, how will it affect your Business Operations?
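To make the MTBF point concrete, here is a minimal sketch of the arithmetic. It assumes a constant failure rate (the simple exponential model), which ignores the wear-out that makes aging hardware even riskier, and the MTBF figure is a hypothetical value chosen only for illustration.

```python
import math

def failure_probability(hours_in_service: float, mtbf_hours: float) -> float:
    """Probability of at least one failure within the given hours,
    assuming a constant failure rate (exponential model)."""
    return 1.0 - math.exp(-hours_in_service / mtbf_hours)

# Hypothetical figures, for illustration only.
MTBF_HOURS = 100_000        # assumed vendor-quoted MTBF for one component
HOURS_PER_YEAR = 24 * 365

for years in (1, 5, 10, 20):
    p = failure_probability(years * HOURS_PER_YEAR, MTBF_HOURS)
    print(f"{years:>2} years in service: {p:.1%} chance of at least one failure")
```

Even under this optimistic model, the chance of a failure keeps climbing the longer the same component stays in service, and a real system contains many such components.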

Unsupported Database Technology or Very Old Database Technologies
This is similar to the COBOL conundrum.
The Database that runs your Core Legacy Application is no longer supported by the OEM Vendor, or is supported only by a very small number of 3rd party vendors.
What happens if there is a problem? Who do you turn to, and where do you go?
The BIG question is when the Database Fails, how will it affect your Business Operations, and how do you mitigate this risk?

Real World Example: Airline Industry

Northwest Airlines - Mainframe failure of its Reservation System for 3 ½ hours on Mar 22, 2000, causing 130 flight cancellations and delaying hundreds of other flights, as employees resorted to manually checking in passengers.
Delta Air Lines - Mainframe failure over 2 days, beginning on Jan 29, 2017, causing 280 flight cancellations and numerous flight delays.
Delta Air Lines - an IT outage in August of 2016 disrupted operations for several days and caused the airline to record a $150 million USD one-time hit to revenue.
Nearly every single Airline has been hit with similar problems, often multiple times over the years.
The biggest single factor is that most of them still operate their Ticketing Reservation System on Mainframe systems that are now 20 to 30 years old.

Real World Example: UTI.edu

In 1999, UTI developed a custom Student Information Management system (SMART), built on Progress Software - a 4GL and database technology stack.
Unfortunately, Progress never really took off and was only adopted by a small portion of companies around the world.
Since it was a “custom” application, we had to maintain it and make any enhancements to it ourselves. This was very challenging, as it was very difficult to find Software Developers or DBAs who knew this technology, or vendors who could support us.
In November 2008 the core database crashed and could not be recovered right away. The core system was down for 3 days, as we recovered the system from backups, troubleshot the data corruption that had occurred, and repaired the code to ensure this specific problem did not happen again.
That’s 3 days of business disruption in a $440 million USD publicly traded company.

Business Need to Support New Devices

Technology will continue to evolve, and the types of devices that Users rely on today to access a System will not be the same types of devices they use 10 years from now. The old green screen CRT Terminal eventually gave way to the Mac / PC Windows style User Interface, which in turn gave way to a Web User Interface, which was then extended to Tablets and Mobile Phones.

If the business application is purely an Internal facing Application, you can hold off for some time before you need to update the UI. However, if the application is Outward facing, then there is a greater business need to keep the UI up to date with whatever devices and UI approaches your Users are using. For example, if your Application is an installed PC based Windows Application, but all of the Sales Reps who need to access it are using an iPad Tablet, then you will need to Reverse Engineer the application to provide a Web based or Native iOS Application.

Slow Performance

The challenge here is that Users, especially Consumers, expect a system’s performance to keep pace with advances in technology. Moore’s Law famously predicted that the number of transistors on a chip would double roughly every two years, which is often quoted as computing power doubling every 18 months. While we are reaching the limits of this after 50+ years, some expect quantum computing (see Neven’s Law) to continue the trend. So, the demand for faster systems and/or more capabilities will be ongoing.

How to determine when this type of Reverse Engineering process is Needed:

Trends over Time showing Customer abandonment of the application - meaning they drop off, instead of waiting
Data collected from Application Performance Monitoring and the processing of synthetic transactions that mimic a User’s behavior
Customer Complaints
Internal User Complaints
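As a rough illustration of the synthetic transaction idea above, the sketch below times a scripted step repeatedly and checks the slowest runs against a response-time target. The `fake_checkout` function and the 0.5 second threshold are hypothetical stand-ins; in practice the step would drive the real application and the threshold would come from your own service targets.

```python
import math
import statistics
import time
from typing import Callable, Dict, List

def time_transaction(transaction: Callable[[], None], runs: int = 20) -> List[float]:
    """Run a scripted transaction repeatedly and return each duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        transaction()
        durations.append(time.perf_counter() - start)
    return durations

def summarize(durations: List[float], threshold_s: float) -> Dict[str, object]:
    """Report the median and 95th percentile (nearest-rank) and flag a breach."""
    ordered = sorted(durations)
    rank = max(1, math.ceil(0.95 * len(ordered)))
    p95 = ordered[rank - 1]
    return {
        "median_s": statistics.median(ordered),
        "p95_s": p95,
        "breached": p95 > threshold_s,
    }

def fake_checkout() -> None:
    """Hypothetical stand-in for a scripted user step against the real system."""
    time.sleep(0.01)

report = summarize(time_transaction(fake_checkout, runs=5), threshold_s=0.5)
print(report)
```

Tracking these numbers over weeks or months gives you the trend line, so the decision to Reverse Engineer does not rest on complaints alone.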

Difficulty Maintaining the Application

Just like with any machine, building, etc., the older an application is, the more difficult it is to keep in good working order at high efficiency. Parts break and need to be replaced. With a Software Application, you have the added challenge that, over time, additional modifications and enhancements will have been made. Often, however, the original documentation of the system was never updated to reflect those changes. So, the Software Engineers maintaining the system may not have all of the information they need to maintain it accurately and effectively.

And because of the complexities involved with a software application, if they make a change or run a process based on incorrect data it can cause unintended effects and problems. All because they don’t know “Why” something works the way it does.

Real World Example: Meridian Learning Management System (LMS)

Meridian is a provider of Corporate LMS systems.
We had hired them to implement an LMS for our organization, which would be used to train our Automotive Technicians.
Once we were in the Pilot Phase of the implementation, we noticed something very odd. Every Thursday at exactly Midnight EST, the system would suddenly go offline for about 15 minutes before recovering.
Being curious, we naturally contacted our Meridian Team, who said that they had no documentation or explanation for why this was happening.

After it happened a 2nd time, we started doing our own research and investigation (Reverse Engineering) work:
We discovered that there was a System Job that would run at this exact moment, which would completely restart the Application Pool of the Web Server and also restart several other processes.
Why? After conducting an analysis of their code, we found several Memory Leaks in their Core LMS Application.
We then found out that one of their former Senior Developers knew about this but never documented it. To “fix” the problem before a crash occurred, he simply had a System Job restart everything to clear the leaked memory.
But because none of this was documented or “known” by any of the other staff, if anyone had stopped this System Job, the result would have been catastrophic.
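For readers unfamiliar with how a Memory Leak like this gets confirmed, here is a minimal sketch using Python’s standard `tracemalloc` module. It is not how the LMS in this example was analyzed; the leaking `handle_request` function is hypothetical and exists only to show the technique of comparing memory snapshots before and after a workload.

```python
import tracemalloc

_cache = []  # hypothetical leak: grows on every request and is never cleared

def handle_request(payload: str) -> int:
    """Stand-in for application code that quietly retains memory."""
    _cache.append(payload * 100)  # the "leak"
    return len(payload)

def measure_growth(requests: int) -> int:
    """Return the net bytes allocated while serving the given number of requests."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    for i in range(requests):
        handle_request(f"request-{i}")
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Compare the two snapshots line by line and total the difference.
    stats = after.compare_to(before, "lineno")
    return sum(stat.size_diff for stat in stats)

print(f"Net growth after 1000 requests: {measure_growth(1000)} bytes")
```

Memory that keeps growing with the number of requests, and never comes back down, is the signature that a scheduled restart only hides.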

Regulatory Changes

In some cases, Regulatory Changes may require your IT Department to completely Reverse Engineer an existing Legacy Application, because it simply cannot support the new changes that are required. Some typical examples include:

Regulatory Agency makes a significant change in how to integrate / interface with their Systems
Ex: changing from simple batch file processing to requiring all companies to use their new Web Services API, which the current Application cannot support.
Ex: New Regulatory Requirements that data of a particular type must be stored in an encrypted format in the database, and your older Database platform does not support this level of encryption.

Known Security Risk

This can occur at either the Software or the Hardware layer. IT Professionals are constantly waging a war against Hackers who want to steal information and data, ransom money from a company, cause outages, or simply cause havoc within an organization for whatever reason. And the easiest way to do this is through “Known Vulnerabilities” in older systems. While the latest version of a Modern Database may be protected from a specific attack, an older version of the same Database may be very vulnerable.
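One simple way to act on this is to compare what you have installed against the lowest version known to contain the fix. The sketch below assumes you already maintain such an inventory; every component name and version number in it is hypothetical.

```python
from typing import Dict, List, Tuple

def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.10.1' into (2, 10, 1) so versions compare numerically, not as text."""
    return tuple(int(part) for part in text.split("."))

def find_outdated(installed: Dict[str, str], minimum_patched: Dict[str, str]) -> List[str]:
    """Return the components running below the lowest known patched version."""
    outdated = []
    for name, version in installed.items():
        required = minimum_patched.get(name)
        if required is not None and parse_version(version) < parse_version(required):
            outdated.append(name)
    return sorted(outdated)

# Hypothetical inventory and advisory data, for illustration only.
installed = {"web-framework": "2.3.5", "database-driver": "9.4.0", "pdf-library": "1.12.0"}
minimum_patched = {"web-framework": "2.3.32", "pdf-library": "1.9.0"}

print(find_outdated(installed, minimum_patched))
```

Note that versions are compared as numbers rather than as text; a plain string comparison would wrongly treat "2.3.5" as newer than "2.3.32".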

Let’s use a Military Analogy on this one:

Scenario 1:

Hackers: Russian T-14 Armata Tank
You: US M1A2 SEP v3 Abrams Tank
Results: You probably win, as the Armata is not reliable at this point and is prone to breaking down, and the latest model of the Abrams is one of the top tanks in the world. Worst case, it is a Draw and you prevent the Hack, but have some repair work to do.

Scenario 2:

Hackers: Russian T-14 Armata Tank
You: US M60 Tank - from the 1980s
Results: You’re toast. You can’t even see them yet and they just blew your turret off and left you a flaming wreck.

Real World Example: Equifax Hack in 2017

Attack occurred from mid-May through July 2017, before being noticed
143 Million Customers were affected, disclosing:
Names
Birthdates
Addresses
Social Security Numbers
Driver’s License Numbers
Credit Card Numbers

Cost according to Equifax for this single Breach: $1.4 Billion USD
And to add Insult to Injury, Equifax had actually been notified earlier that year (prior to the Breach), by the Department of Homeland Security’s US-CERT, that their systems had a known Security Flaw which could be exploited.

Gray Hat: Reverse Engineering a Commercial Application

Frequently, organizations or Executives would like to better understand how a Competitor’s Commercial Application works and functions. So, they will use Reverse Engineering techniques to take apart an application that they have access to, in order to determine exactly what it does and how it works. This is definitely a very “Gray” area, and in some cases may be illegal, even if your intention is only to figure out how the system works.

Most User License Agreements include a Clause that prevents you from Reverse Engineering the application in any way. And if you do it with the specific intent of duplicating a Competitor’s functionality into your own application, then this normally crosses the line and would be illegal in most Western Democratic Countries.

However, there are times when you may be forced to Reverse Engineer a Commercial Off the Shelf Application. Normally, this might occur when the Vendor files for Bankruptcy with the intent to dissolve the Company. In this case, you may find yourself with an application that no one will support, and that you can’t replace quickly or easily. In these cases, you may be forced to hire staff to support the Application and Source Code yourself, if it was abandoned, so you may need to Reverse Engineer the Application. Always check with your Legal Counsel, in any of these areas, before proceeding.

Black Hat: Reverse Engineering an Application to Attack It

Sadly, not everyone in the World does things for a good, noble, or honest reason. Criminal hacking organizations, individuals, and even foreign Government Agencies will frequently take an application and Reverse Engineer it, in order to find vulnerabilities (internal or external) that they can exploit to attack the Application.

These attacks can take a multitude of forms, and if successful can be very damaging to your Organization and/or very profitable for the Hacker.

Methods for Reverse Engineering Software

There are two primary methods for Reverse Engineering Software:

You have the Source Code

This is the easiest, as you have complete access to the actual Source Code and your team can go through it and read it (to the extent that they can). Keep in mind that this is not a perfect process, and can be very time consuming for the analysis to be performed due to:

Very Poor Naming conventions used in the code, business processes, and database design
Little or No documentation within the code itself.
Little or No documentation about the system as a whole, or how it works.

So, even if you have the actual Source Code, it may be very difficult to figure out what it actually does.

Real World Example: Honeywell - Project Management Add-On to Oracle Applications

Honeywell had a custom built add-on Application to Oracle Applications for their PMO in Phoenix, AZ.
At the time we needed to do a Y2K Test of this application, as it managed hundreds of millions of dollars in various projects and was considered a critical application to test.
So, we had to Reverse Engineer the application and document it.
We thought that this would be fairly straightforward, since we had the actual Source Code available to us.

We quickly ran into numerous Problems doing this:
Attributes in the code were named: A1, A2, B5, C7, D2, Z10, etc.
Business Methods looked like this:
R7 := ((D2 * A5) + G4) / Z10
If (H2 >= D8, THEN S8, ELSE Concatenate (“Error ” + E4 + T2))
You had to trace through every part of the code to figure out what each Attribute meant and what it did.
And there was absolutely no documentation on the system.
Worse, the Database structure was not named in any human readable format:
Table 2100 - had columns A, B, C, D, E, F, G
Table 2110 - had columns A, B, C, D
Table 2150 - had columns A, B, C, D, E, F, G, H, I, J, K, L
Ultimately the project was a success, BUT it took significantly longer than what anyone expected due to the very poor design and lack of documentation.
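Tracing cryptic names like these by hand is slow, and a small script can take over the bookkeeping. The sketch below is not the tooling used on that project. It builds a cross-reference of where each Attribute appears, then substitutes meanings into a line as you work them out. The two sample lines come from the example above, while the meanings in `data_dictionary` are invented purely for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches names such as A1, D2 or Z10: one capital letter followed by digits.
ATTRIBUTE = re.compile(r"\b[A-Z]\d+\b")

def build_cross_reference(source_lines: List[str]) -> Dict[str, List[int]]:
    """Map each attribute name to the line numbers where it appears."""
    index = defaultdict(list)
    for number, line in enumerate(source_lines, start=1):
        for name in ATTRIBUTE.findall(line):
            if number not in index[name]:
                index[name].append(number)
    return dict(index)

def annotate(line: str, data_dictionary: Dict[str, str]) -> str:
    """Replace every attribute whose meaning is known with a readable name."""
    return ATTRIBUTE.sub(lambda m: data_dictionary.get(m.group(0), m.group(0)), line)

sample = [
    "R7 := ((D2 * A5) + G4) / Z10",
    'If (H2 >= D8, THEN S8, ELSE Concatenate ("Error " + E4 + T2))',
]

# Hypothetical meanings, filled in as the analysis uncovers them.
data_dictionary = {"D2": "unit_cost", "A5": "quantity"}

print(build_cross_reference(sample))
print(annotate(sample[0], data_dictionary))
```

The data dictionary that accumulates along the way also becomes the start of the documentation the system never had.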

You don’t have the Source Code

In this case you don’t have access to the actual Source Code of the application, meaning that your Development Team can’t simply “Read” the code. There are two main methods for working out what the application does and how it works.

Binary Reverse Engineering

If you have access to the actual Executable or Binary Files (such as a locally installed application), then you can Reverse Engineer or Decompile the Executable or Binary Files. This produces an approximation of the underlying Source Code, which can then be read and analyzed by a Developer.

However, in many cases special Application Tools are required in order for this to work correctly. In addition, this often requires a specialized skill set in order to accomplish. This is often done through either of these methods & tools:

Disassembler - which translates the Application’s Machine Language (Binary) into Assembly Language, a low level but human readable form.
Decompiler - which takes the actual Executable and decompiles it into a higher level language, which can be more easily understood by the Developer. However, the result is rarely 100% faithful to the original Source Code.
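To get a feel for what disassembled output looks like, here is a small sketch using Python’s built-in `dis` module. It works on Python bytecode rather than the native Machine Language of an installed executable, so treat it as an analogy only; native binaries require dedicated disassembler and decompiler tools. The `late_fee` function is a hypothetical stand-in for compiled business logic.

```python
import dis

def late_fee(balance, days_overdue):
    """Hypothetical business rule standing in for compiled application logic."""
    if days_overdue > 30:
        return balance * 0.05
    return 0.0

# List the low level instructions the interpreter actually executes.
instructions = [ins.opname for ins in dis.get_instructions(late_fee)]
print(instructions)

# Print the full human readable listing, one instruction per line.
dis.dis(late_fee)
```

The listing shows the comparison against 30 and the multiplication by 0.05, but not why those values were chosen, and that is exactly the gap a Reverse Engineering effort has to fill.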

Analysis through Observation

This process can include both Automated and Manual processes to analyze and determine how a given Application works.

For lower level systems or embedded applications, often an automation tool will be used. Naturally, it is also best to get a QA Testing Suite or Tool from the OEM, to use against these systems or embedded applications, if possible.

If it is a Web Application, then frequently you would have a team interact with the application recording what occurs at each step, and what affects what other areas of the application. This can be very time consuming, and also extremely valuable depending upon the insights gained.
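The same record-and-compare approach can be automated for any system that accepts inputs and returns outputs. The sketch below is a minimal illustration: it feeds a range of inputs to an opaque function and reports where the output changes, which hints at a hidden business rule. The `legacy_shipping_fee` function is a hypothetical stand-in for the system under observation.

```python
from typing import Callable, Iterable, List, Tuple

def observe(system: Callable[[float], float], inputs: Iterable[float]) -> List[Tuple[float, float]]:
    """Record (input, output) pairs from a system we cannot read the code for."""
    return [(value, system(value)) for value in inputs]

def find_breakpoints(observations: List[Tuple[float, float]]) -> List[float]:
    """Return the inputs at which the output differs from the previous observation."""
    breakpoints = []
    for (_, prev_out), (value, out) in zip(observations, observations[1:]):
        if out != prev_out:
            breakpoints.append(value)
    return breakpoints

def legacy_shipping_fee(order_total: float) -> float:
    """Hypothetical black box; in practice this would be the real application."""
    return 0.0 if order_total >= 100 else 7.5

observations = observe(legacy_shipping_fee, range(0, 201, 50))
print(observations)
print(find_breakpoints(observations))
```

Once a breakpoint shows up, you can probe more finely around it (here, between 50 and 100) to pin down the exact threshold and document the rule.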

Legality of Reverse Engineering Software in the USA

Always contact your Legal Counsel for specific advice on a given case before proceeding.

In the United States, it is often lawful to Reverse Engineer a software application that you have legally purchased or licensed. This is especially true if your intent is to determine how the application interacts with another system or application, in order to use the two systems together; see Case Law: Sega v. Accolade, where Reverse Engineering for interoperability was held to be fair use. However, license terms can override this: in Bowers v. Baystate Technologies, the court enforced a license clause that prohibited Reverse Engineering. None of this allows you to Reverse Engineer a 3rd Party Application in order to copy it, duplicate its code or design, etc. So, again, always contact your Legal Counsel for specific advice.

We hope that you have enjoyed this blog article.

Thank you, David Annis.

Former CIO.

David Annis

David is a VP and Agile Coach within ArkusNexus, having served in multiple CIO and VP of Software Development roles. He assists our Sales, Marketing, and Operations Teams on critical initiatives.

dannis@arkusnexus.com