SOC is an acronym for System and Organization Controls, a popular standard for enterprise applications. A SOC2 certification puts a stamp of approval on an organization’s security controls and builds trust with customers. Every organization needs to build enterprise applications with SOC2 compliance in mind. Below is a conversation from our series ‘Net Talks’ on YouTube. Joseph Jude (CTO at Net Solutions), Harish Verma (Senior Architect at Net Solutions), and Amit Manchanda (Senior Technical Project Manager at Net Solutions) share how they designed and developed SOC2-compliant applications for two enterprise clients and how they conducted BCP drills to demonstrate availability for SOC2 certification.
Joseph: Hello and welcome to Net Talks. I’m Joseph Jude, Chief Technology Officer at Net Solutions.
Harish: Hi, I am Harish Verma, working as a Senior Technical Architect with Net Solutions.
Amit: Hello everyone, I’m Amit Manchanda. I’m a Senior Technical Project Manager at Net Solutions with more than eight years of experience.
Joseph: Thank you, Harish and Amit, for joining us. SOC stands for System and Organization Controls. It is probably the most common security standard for enterprise applications. A SOC2 certification puts a stamp of approval on an organization’s security controls. Hence, it builds trust with customers.
SOC measures three parts of the organization: people, process, and technology. As a technology company which designs, develops and monitors customers’ applications, we are at the third pillar of the certification. Recently, we were involved with two of our enterprise clients to get their applications SOC2-certified. Three of us are here to talk about how we designed and developed these applications plus how we conducted the BCP so as to measure the availability for SOC2 certifications.
Harish, can you please talk about the project?
Harish: Sure, Joseph. The client we’re talking about deals with the licensing part of applications. It involves deal management, disclosure management, royalties, and then the payments. We get the information from the customers and process the data. There is personal information of the users as well, and we make sure that it is stored safely and transmitted securely.
Joseph: I can understand that since there are both payment details and the personal details of their customers, security will be a key part of this particular application. Thank you. Amit, what about you?
Amit: Sure, Joseph. The client we’re working with manufactures cleaning products, and they’ve been in business for more than 60 years now. So, we at Net Solutions helped them build a digital solution that validates cleaning protocols and evidence-based technology. It helps in revealing the contamination on surfaces, which could be a desk or any other physical surface.
We also helped them build a successful iPad application that allows facility teams to plan, conduct, track, and report site assessments regarding the quality of the cleaning services. We have also built several reports and dashboards that help in communicating results for monitoring, feedback, and continuous improvement.
Joseph: Okay, thank you. Harish, can you talk to me about the different levels of SOC certifications and what they mean?
Harish: Yes, Joseph. SOC (System and Organization Controls) has three levels of certifications: SOC1, SOC2, and SOC3. SOC1 is applicable if you’re dealing with the financial data of your customers. SOC2 is about security, i.e., whatever controls we’ve applied to the data we’ve collected and the systems that are in place now. Both of these reports are generated for the auditor. So, you don’t necessarily need to share them with the public. You go through the audits and share these reports with a registered CPA, who certifies your organization.
SOC3 is more of a marketing tool. So, organizations that are SOC2 certified can get SOC3 certification and publish on their public websites to claim they are secure enough.
Joseph: Okay, thanks, Harish. Amit, you have been involved in this particular application from the ground up. Can you talk to me about how we took care of SOC from the design stage itself? Also, now that you have been involved for quite some time, what would you do differently if you were to start designing today with everything that you already know?
Amit: Joseph, if I go back three or four years to when the application started, I don’t think SOC was in the picture in terms of any plans. But I think the basic engineering processes that we follow at Net Solutions really helped us do the groundwork for SOC compliance. Some of the aspects, like security, privacy, and availability, were actually covered as part of the SDD document, which you prepare for any application.
The technical implementations were already in place before we went into the SOC process. So, they were already taken care of through that process.
But were there some things that we had not factored in? Yes, there were a few items: certain requirements from the infra perspective, like the backup policies, monitoring tools like PRTG or something similar, and the logging mechanism, which we had not factored in initially. We kept on adding all the requirements that we had not factored in earlier into the application so that the testing journey is fine-tuned and we are able to go through that process.
You asked whether there is anything we would do differently. Yes. I think there are certain processes that we need to follow in terms of sharing the evidence, which you need to capture regularly. You can’t somehow generate all that evidence overnight, right?
So, you need to start tracking it regularly and share all the artifacts with the auditing team on a monthly or bi-weekly basis. You need to keep track of them, keep on sharing them, and obviously create them in parallel.
I would also like to add that sometimes there are business requirements that lead to certain changes in your application.
You may add a few new components or new requirements, so all of that also needs to be documented, and your artifacts need to be updated in accordance with the changes that the business team is asking you to make in the application.
Joseph: Correct! Harish, would you like to add anything to what Amit has talked about?
Harish: Yeah, maybe I’ll continue from where Amit left off. The most important point is that you should start keeping compliance in mind early, because whenever you are working on designing or developing a secure application, you have to start early. If you retrofit things later on, it may take more time than if you had started early.
When considering a security compliance standard like SOC, you consider all the lessons learned and the compliance checklist that you can follow. For example, whenever you are designing any application, you have to make sure that you are managing the access controls on the system, whether it is the logical data or the physical devices. You have to make sure that the controls are in place so that only authorized access is allowed.
Then comes change management. Whenever you are working on change management, you work on your CI/CD pipelines and your DevOps. You make sure that your pipelines are in place for each and every environment so that whenever you need to deploy to your production environment, for example after a disaster strikes, you know that you have the latest software and the latest data with you. You should also have your backup policies in place for your file systems and your data.
So, you should know all these points. If you’re covering your bases, it helps a lot later on as well. It makes your compliance and auditing process much smoother.
Joseph: Thanks, Harish. If I were to summarize what you said, following a shift-left policy for security, along with the other factors that we consider, like performance, helps a team a lot. To sum up what Amit said, having an engineering process which already takes care of SDD, DevOps, 12-factor methodology, etc. takes care of implementing SOC controls from early on, rather than patching them in at a later point in time.
Okay, we talked about design and development. Let’s shift to the BCP procedures that we conducted. Harish, would it be possible for you to walk me through the BCP procedure and how you prepared our side of the team for BCP?
Harish: Sure, Joseph. BCP means Business Continuity Plan. It’s a set of guidelines that helps you recover your system as quickly as possible and keeps it from being down for a significant time in case of a disaster, so that you avoid business or operational losses.
Now, a plan is successful if you train your people and follow the guidelines. Otherwise, it’s a failure. So, you have to ensure you follow the guidelines.
First, you analyze your set of business requirements, the parts of your business applications, and the functional areas. You make sure that you’re aware of all those functional areas and the criticality of those operations. One operation or one area can be very business critical, which means you have to restore it at a higher priority than the others.
Then, you have to involve all the stakeholders who are a part of your crisis management team. They can be your business users, your infrastructure guys, and your developers. So, you need to ensure that you have a proper team with clear roles and responsibilities, and that everyone knows which steps to follow and in which order. It ensures that anything and everything we do is a part of the BCP process and the BCP is successful.
When you’re creating your BCP, there should be a checklist. You can create one checklist, which we usually call a runbook or a playbook. There you have all the detailed steps and timings. Whenever you’re doing that, ensure that there’s a success criterion attached to each step. That means at the end of the exercise, you can say whether the step you performed was a success or not, because it drives all your following actions as well.
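To make the runbook idea concrete, here is a minimal Python sketch of a checklist in which every step carries an owner, a planned duration, and a success criterion. It is only an illustration under stated assumptions: the `RunbookStep` and `run_runbook` names, the planned minutes, and the lambda checks are all hypothetical and do not come from the projects discussed here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RunbookStep:
    name: str                      # what is done in this step
    owner: str                     # team responsible for the step
    planned_minutes: int           # time budgeted in the plan (made-up values)
    succeeded: Callable[[], bool]  # success criterion checked for this step


def run_runbook(steps: List[RunbookStep]) -> List[str]:
    """Walk the steps in order and stop at the first failed criterion."""
    completed = []
    for step in steps:
        if not step.succeeded():
            # A failed criterion blocks every step that follows it.
            break
        completed.append(step.name)
    return completed


# Hypothetical drill: the lambdas stand in for real checks.
steps = [
    RunbookStep("Provision infrastructure", "Infrastructure", 60, lambda: True),
    RunbookStep("Restore data and file backups", "Infrastructure", 90, lambda: True),
    RunbookStep("Run deployment pipelines", "DevOps", 45, lambda: False),
    RunbookStep("Business validation", "QC", 60, lambda: True),
]

completed = run_runbook(steps)
print(completed)
```

In this sketch the third criterion fails, so only the first two steps are reported as completed, which mirrors the point that the outcome of one step drives the actions that follow.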
When we’re working on BCP, there are two important terms: RTO and RPO.
RTO is the recovery time objective, which is the maximum time you can take to restore your systems before the business suffers a significant loss. RPO is the recovery point objective, which is the maximum amount of data you can afford to lose when your system goes down without a significant loss to your business operations.
While preparing a business continuity plan, we keep these things in mind, continuously practice, share the whole plan with stakeholders, and we keep on updating it. It’s a continuous process, not a one-time process. You create a plan, you follow it, and make it a regular practice.
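The two terms above can be checked with simple date arithmetic. The following Python sketch assumes that downtime is measured from the start of the outage to the moment service is restored, and that data loss is measured as the gap between the last usable backup and the start of the outage. The function name `meets_objectives`, the timestamps, and the objective values are hypothetical examples.

```python
from datetime import datetime, timedelta


def meets_objectives(last_backup, outage_start, service_restored, rto, rpo):
    """Return (rto_met, rpo_met) for a single outage or drill."""
    downtime = service_restored - outage_start      # compared with the RTO
    data_loss_window = outage_start - last_backup   # compared with the RPO
    return downtime <= rto, data_loss_window <= rpo


# Hypothetical drill timeline.
last_backup = datetime(2024, 1, 10, 8, 30)
outage_start = datetime(2024, 1, 10, 9, 0)
service_restored = datetime(2024, 1, 10, 14, 0)

result = meets_objectives(
    last_backup,
    outage_start,
    service_restored,
    rto=timedelta(hours=24),  # example objective
    rpo=timedelta(hours=1),   # example objective
)
print(result)
```

Here the system is back after five hours and the last backup is 30 minutes old, so both objectives are met. Tightening the RPO to 15 minutes would make the second value `False`.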
Joseph: Lovely! I like what you said: “The plan is only good if we train everybody on that particular plan and it is executable. Otherwise, it’s just a piece of paper.”
Amit, you also conducted BCP. Is there anything you would like to add, or anything you did differently from what Harish talked about?
Amit: Joseph, at a high level, the planning and the execution have been similar to what Harish talked about. We also talked about all the activities that we were going to carry out on the day of the BCP drill. We talked about all the responsibilities and all the key stakeholders that were going to perform those activities.
We also defined the sequence of those activities, like where to start and how we were going to conduct each activity. So, when the execution actually started, which happened on the day of the BCP drill, we took notes on each activity and recorded the time it took against the time we had planned for it.
Once the drill was conducted successfully, we did a retrospective to identify findings and gaps against what we had planned initially, and the learnings we can carry forward to the next BCP drill or to whenever we have to actually implement the BCP in our project.
We also submitted an execution summary of all the steps that we executed, plus all the artifacts that we captured before and after the drill, to show that the drill was actually successful.
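Recording actual time against planned time, as described above, lends itself to a small comparison that can feed the retrospective. The Python sketch below is one possible way to do it. The `find_overruns` helper, the activity names, and the minute values are hypothetical.

```python
def find_overruns(planned, actual):
    """Return {activity: extra_minutes} for activities that ran over plan."""
    return {
        activity: actual[activity] - planned_minutes
        for activity, planned_minutes in planned.items()
        # Activities with no recorded time are skipped rather than guessed.
        if actual.get(activity, 0) > planned_minutes
    }


# Hypothetical plan and recorded times, in minutes.
planned = {"Restore backups": 90, "Run pipelines": 45, "Testing": 120}
actual = {"Restore backups": 80, "Run pipelines": 45, "Testing": 180}

overruns = find_overruns(planned, actual)
print(overruns)
```

With these made-up numbers, only the testing activity is flagged, with an overrun of 60 minutes. A list like this gives the retrospective a concrete starting point.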
Joseph: True! One thing that I’ve always seen, and which I insist that we follow, is not only the planning part and the runbook that we prepare as a plan, but also the retro that happens afterwards, so that we can improve upon whatever we have done. There’s continuous improvement because the BCP routine is not a one-time exercise. Because it checks availability for SOC2 certification, it needs to be done continuously. Once we have done a retro, it helps us improve in the next iteration and subsequent iterations.
So, thank you, Amit. We talked about the planning and then what happens. Let’s go to that day of conducting the BCP. So, Amit, you already talked about some of the retro. I would want to hear some of the retros that you have done. But I want to go back to the beginning of the day.
What does that day look like? When do you start? How do you start? Amit, do you want to start answering that?
Amit: Sure. Basically, we set a particular schedule, and a time is decided when we are all going to meet on that particular day. A ticket is raised saying that there has been a disaster or some kind of event, due to which the infrastructure or the application is not available.
We try to simulate an actual scenario, how that event would be triggered, and how we are going to respond to it. So, everybody comes onto a common bridge. We ask all the team members who are supposed to be part of that drill. We all collectively join that bridge and start going through all the activities.
We declare that, yes, there has been a disaster. There is downtime now. So, we need to start preparing for that. Immediately, we start the execution process, taking all the previous backups, which we have taken regularly, and trying to simulate how we would actually respond to that event if it happened.
Joseph: Okay. Harish, what did your day look like? How long did it take? Can you share some knowledge on that?
Harish: Sure, Joseph. We are supporting two applications. They have their mobile counterparts as well, so four applications in total. The RTO for us in this case was 24 hours. It means that within that time you have to collect your evidence, perform all the activities in the runbook, get the new system up and running, collect the evidence again, compare the two sets, and then make sure the system is working as it was before.
Because the two teams are in different time zones, a common time was agreed upon between the teams and stakeholders. We started early in the U.S. day so that there would be a longer overlap than if we had started at midday or so.
As for the steps, we started by scheduling a meeting to which all the key stakeholders were invited. Everyone was prepared and knew what to do. The business and QC teams started collecting evidence. Then, each team performed its activities. The infrastructure team created the required infrastructure and restored the data and the file backups.
The DevOps team ran their pipelines, made configuration changes, and started their jobs. Once everything, including the self-validation, was done and the system was in place, we moved to the QC and the business validation. Once it was done, we stored the evidence and generated a report for the auditors.
All these activities took us close to five hours. In this time, we were able to perform each and every step, record everything, verify everything, and generate the report as well.
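The step of collecting evidence before and after the restore and then comparing it can be sketched in a few lines of Python. This is an assumption about what such a comparison could look like, not a description of what the team actually ran. The `compare_evidence` helper and the snapshot of record counts are hypothetical.

```python
_MISSING = object()  # marks a key that exists in only one snapshot


def compare_evidence(before, after):
    """Return the sorted keys whose values differ between two snapshots."""
    keys = set(before) | set(after)
    return sorted(
        key
        for key in keys
        if before.get(key, _MISSING) != after.get(key, _MISSING)
    )


# Hypothetical record counts captured before the drill and after the restore.
before = {"users": 1200, "deals": 340, "payments": 5120}
after = {"users": 1200, "deals": 340, "payments": 5100}

mismatches = compare_evidence(before, after)
print(mismatches)
```

An empty result would mean the restored system matches the original snapshot for every item that was captured. Any key in the list points to something that needs a closer look before the drill is signed off.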
Joseph: Wonderful! Amit, you talked about some of the retros. So, I would be keen to know. Maybe you can share one or two findings from your retro.
Amit: One of the key findings was that we had set a scheduled time for our testing of the environment. But when we actually went into the testing activities, we found that we took about an hour more than what we had planned for. When we did the retro, we found out that instead of running the smoke test cases, we were doing a detailed regression round, right?
It meant that we were covering many more test cases than we should have. So, it took us close to three hours instead of the planned two hours. Ideally, we should have done that activity in two hours. So, the learning for us was that we should focus on the key business requirements or the smoke tests, as we can’t go into a detailed round of testing, right?
You need to focus on your smoke tests to ensure that the system is up and running, compare the data obviously, and then give a go-ahead. So, that was one of the learnings.
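One way to keep a drill from sliding into a full regression round is to tag the smoke checks up front and verify that they fit the planned window. The Python sketch below illustrates that idea. The `Check` class, the `select_smoke_checks` helper, the check names, and the durations are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Check:
    name: str
    minutes: int  # estimated duration (made-up values)
    smoke: bool   # True if the check covers a key business flow


def select_smoke_checks(checks: List[Check], budget_minutes: int) -> List[Check]:
    """Keep only smoke checks and fail early if they exceed the time budget."""
    selected = [check for check in checks if check.smoke]
    total = sum(check.minutes for check in selected)
    if total > budget_minutes:
        raise ValueError(f"Smoke checks need {total} min, budget is {budget_minutes} min")
    return selected


# Hypothetical suite for a drill with a two-hour testing window.
suite = [
    Check("Login and dashboard load", 10, True),
    Check("Generate assessment report", 20, True),
    Check("Full regression pack", 180, False),
]

selected_names = [check.name for check in select_smoke_checks(suite, budget_minutes=120)]
print(selected_names)
```

The regression pack is left out of the drill, and the helper raises an error during planning if even the smoke checks would not fit the window.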
Joseph: Fantastic. Fantastic. Thank you, Harish. Thank you, Amit, for sharing your knowledge on how to design and develop a SOC-compliant application, as well as how to run a BCP procedure as part of evaluating the availability of SOC2 applications.
Harish: Thank you so much.
Amit: Thank you.
Joseph: Thank you for watching our Net Talks. If you have any requirements for developing enterprise applications, we would love to talk to you. Thank you.