All posts by Ben Mawhinney

The App Chaps Digital Technical Information & Security Policies


Technical Contact: Ben Mawhinney – ben@dronelab.io

Infrastructure

What is the infrastructure & technology stack that powers your solution?

Our main stack is MERN (MongoDB, Express, React, Node). Other languages, such as Python, are used for parts of the system where appropriate. MySQL and MongoDB are our main data stores, with Redis used for caching. Everything is hosted on Amazon Web Services, with Cloudflare as the CDN and Argo where required. As every website and application is different, we review the requirements of each project before placing it on the most appropriate service. We use Lightsail for relatively lightweight deployments and scale all the way up to dedicated servers for high-bandwidth, high-traffic applications.

Where is your data geographically stored & is the data replicated to multiple geographical locations?

The application and data can be replicated across a range of locations where performance or redundancy requires it. The primary application and data are usually hosted in either London, UK or Dublin, Ireland, with replication across a number of availability zones if required. Each availability zone runs on its own physically distinct, independent infrastructure and is engineered to be highly reliable.

What are the SLAs around uptime?

We offer 99.98% uptime.
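
To make the figure concrete, the short sketch below converts an uptime percentage into the downtime it permits. The function name and the 30-day and 365-day periods are illustrative assumptions, not part of our contractual wording.

```python
def downtime_budget_minutes(uptime_percent: float, period_days: float) -> float:
    """Return the maximum downtime, in minutes, that still meets the uptime target."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (1 - uptime_percent / 100)


# Assumed measurement periods: a 30-day month and a 365-day year.
monthly = downtime_budget_minutes(99.98, 30)
yearly = downtime_budget_minutes(99.98, 365)
print(f"30-day budget: {monthly:.2f} minutes")   # 8.64 minutes
print(f"365-day budget: {yearly:.2f} minutes")   # 105.12 minutes
```

In other words, a 99.98% target leaves under nine minutes of downtime in a 30-day month.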

 

What is your Disaster Recovery strategy?

The App Chaps (TAC) configure automated daily backups on a per-site, per-environment basis.

Typically, we schedule a rolling database backup for each site every 5 minutes. These backups are encrypted and stored securely within Amazon S3.

These backups ensure we can restore a website to any point within the past seven days in the event of data corruption.
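
As a rough illustration of how a restore point could be chosen from rolling backups, the sketch below picks the newest backup taken at or before a requested time and rejects requests outside the seven-day window. The function and constant names are hypothetical, and the selection rule is an assumption for illustration rather than a description of our tooling.

```python
from bisect import bisect_right
from datetime import datetime, timedelta
from typing import Optional, Sequence

RETENTION = timedelta(days=7)           # restore window described above
BACKUP_INTERVAL = timedelta(minutes=5)  # typical rolling backup schedule


def pick_backup(backups: Sequence[datetime], target: datetime,
                now: datetime) -> Optional[datetime]:
    """Return the newest backup at or before `target`, or None if unavailable.

    `backups` must be sorted in ascending order.
    """
    if target > now or now - target > RETENTION:
        return None  # outside the restore window
    idx = bisect_right(backups, target)
    return backups[idx - 1] if idx else None


# Hypothetical backup history: one backup every 5 minutes for the last 7 days.
now = datetime(2024, 1, 8, 12, 0)
backups = [now - RETENTION + i * BACKUP_INTERVAL for i in range(2017)]
print(pick_backup(backups, datetime(2024, 1, 5, 9, 32), now))  # 2024-01-05 09:30:00
```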

 

This means that, for a typical Drupal deployment (for example), the servers are backed up on a daily basis and can be rebuilt from GitHub using our CircleCI deployment mechanism.

In order to meet our 99.98% availability target, our databases are usually located in multiple availability zones for added stability and redundancy. Again, this is dependent on the requirements set out by the client and the business needs.

 

This typical configuration means that we are able to operate a highly available stack. In the event that one server goes down, the ‘spare’ server will pick up the load. This also allows for seamless upgrades of the environment.

 

We set alerts on the load balancer to identify availability issues as they occur. These alerts are sent to the team via Slack and/or SMS.

A monthly uptime report is issued for each site as part of the ongoing support deliverables.

 

How are your systems protected?

Within AWS we operate inside our own Virtual Private Cloud (VPC), which keeps us isolated from the other infrastructure AWS hosts. We rate limit our API to help protect against DDoS attacks.
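
For readers unfamiliar with rate limiting, the sketch below shows one common approach, a token bucket. It is a generic illustration: the class name, parameters and limits are assumptions, and it is not a description of the mechanism we run in production.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = capacity      # start with a full bucket
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request is within the limit."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill for the time that has passed, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Hypothetical limit: 5 requests per second with bursts of up to 10.
bucket = TokenBucket(rate=5.0, capacity=10.0)
print(bucket.allow())  # True: the bucket starts full
```

In a real API, one bucket would typically be kept per client key or IP address, and a rejected request would receive an HTTP 429 response.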

 

To ensure no one gains access to this infrastructure, we operate a no SSH, FTP or SFTP policy for our servers and have no way to access them or their data directly. If a change is required, we gracefully tear down the machines and reprovision them.

Within AWS we use IAM roles to restrict staff members to the parts of the system they need, and we keep a log of all access activity.

How is the data backed up?

Within the company we operate a number of backup strategies:

  • Live replication ensures our data is distributed and available globally at any time. If a server cluster falls out of service, a new one is started and receives the data from another node.
  • Point-in-time backups ensure we can restore a database to any second in the past seven days in the event of data corruption.
  • Complete database backups are made frequently, encrypted and stored securely within Amazon’s cloud across a range of locations. These backups are not accessible from the public internet and are retained for up to 3 months.

Is data encrypted at rest and in-transit? SSL?

All requests to our API are made over SSL/TLS. Dashboard logins and requests are sent securely over HTTPS. Dashboard passwords are hashed using bcrypt. Our data stores are inaccessible from the public internet and are held securely within AWS. Whether messages sent through the platform are encrypted depends on the messaging platform used.

Is the data replicated to dev/staging, or is test data used?

We use test data in our dev and staging environments.

 

Data Access

What data will the company have access to?

Internal access to the databases is disabled by default and is granted only when a problem occurs that requires it.

Who will have access? Internal employees? External contractors?

If we require access to the database, only internal employees will be granted it. We explicitly do not allow external contractors access at any point.

Are there regular reviews of user accounts?

As a matter of course, we do not need to review user accounts, because users are never given access to any part of the system they do not need for their current task. At the start of each assignment we grant the access needed to complete it, and at the end of the assignment that access is revoked.

What is the policy for removing user accounts?

IAM roles within our hosting infrastructure are removed when an employee leaves. Outside contractors never receive access to the infrastructure or codebase.

Account password policies?

All account passwords must be at least 12 characters long and contain at least one uppercase letter, one lowercase letter, one number and one non-alphanumeric character. Passwords are changed every 30 days. Two-factor authentication is also required alongside the password.
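
The composition rules above can be expressed as a simple check. The following is a minimal sketch; the function name and messages are hypothetical, and it covers only the character rules, not the 30-day rotation or two-factor requirement.

```python
from typing import List

MIN_LENGTH = 12  # minimum length stated in the policy above


def policy_violations(password: str) -> List[str]:
    """Return a list of policy rules the password breaks (empty if compliant)."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append(f"shorter than {MIN_LENGTH} characters")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("no number")
    if not any(not c.isalnum() for c in password):
        problems.append("no non-alphanumeric character")
    return problems


print(policy_violations("Correct-Horse-7battery"))  # []
print(policy_violations("short"))
```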

Who has administrative/super-user access to systems and data?

Mark Middleton, our Managing Director and Chief Technology Officer, is the only member of the team with complete access to all portions of the systems we allow access to.

The majority of our engineering team has access to a major portion of the administrative systems for good reason. We limit certain access to other individuals using IAM roles, and general staff have no access to the internals of the system at all.

Are there separate accounts for each user? Are any accounts shared?

All accounts we provide to our team are separate and nothing is shared. All access to our infrastructure is logged.

Is each login and access logged?

Each login and action performed is logged for any infrastructure or code changes. Direct server or database access is not allowed.

Who has remote access to the systems and data?

We have a policy of zero remote access to live systems and data. In the event of an issue with our servers, we gracefully tear them down and rebuild them on demand.

Who has physical, onsite access to the systems and data?

As our platform is hosted in the cloud, the only people with direct physical access to our servers or data are Amazon staff. Even if they were to attempt to access the virtual machines within their physical hosts, all elements are locked down.

Policy and Process

Method in which data is backed up? (i.e. hard drives, CDs, thumbdrives, etc.)

Data is backed up to our secure cloud. We also have plans in motion to add redundancy by storing multiple regular copies of data across multiple locations. For example, we will use Amazon’s Glacier storage for data that is more than three months old and infrequently accessed.

Are there regular security audits of systems? Are they internal or external audits? Software development code reviews?

We perform code reviews on a day-to-day basis and run internal audits of our cloud systems every month.

What is the company’s data retention policy?

Data is backed up daily and kept for 3 months. We also have plans in motion to retain more granular backups for a longer period.
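
A retention rule like this can be sketched as a simple date comparison. In the example below, "3 months" is approximated as 90 days and the function name is hypothetical; both are assumptions made for illustration only.

```python
from datetime import date, timedelta
from typing import Iterable, List

RETENTION_DAYS = 90  # assumption: "3 months" approximated as 90 days


def expired_backups(backup_dates: Iterable[date], today: date) -> List[date]:
    """Return the backup dates that fall outside the retention period."""
    cutoff = today - timedelta(days=RETENTION_DAYS)
    return [d for d in backup_dates if d < cutoff]


# Hypothetical history: one daily backup for each of the last 100 days.
today = date(2024, 6, 30)
history = [today - timedelta(days=n) for n in range(100)]
print(len(expired_backups(history, today)))  # 9 backups are older than 90 days
```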

Will we have access to the data stored, if we request for it?

We offer no public services for access to raw data other than the dashboard and its export functions, but we are open to agreements on a client-by-client basis.

In the event of a breach, what are the protocols followed? Are customers notified?

While this has never happened, we perform quarterly tests to ensure our response to such incidents is well rehearsed if and when they happen. Our protocol is as follows:

  • In the event of a breach, all customers will be notified as soon as we are aware of the incident.
  • External access keys will be revoked and regenerated depending on the type of incident.
  • Clients may be required to reset their passwords.
  • All cloud credentials would be reset.
  • Data storage credentials would be changed.

 

We routinely test our applications for vulnerabilities using external auditors, who use a hybrid method of penetration testing to probe our infrastructure for possible breaches and ensure that any loopholes are closed before a system reaches production.

What is the process to escalate outages to your team? What are the SLAs for response?

We categorise reported issues into three classes of severity. All reports are sent directly to an operations engineer, who assesses and assigns the report. Outside of work hours, on-call duty rotates between team members every week, and we operate systems to alert them to such incidents.

As production issues are the most common cause of downtime for any company, only operations engineers who are able to rectify these problems are assigned to out-of-hours support.

An example of our SLA structure is as follows. It can be tailored on a customer-by-customer basis:

  • First Class
    • Response time of 120 minutes or less
    • Outages and downtime
    • Messages not sending
    • Security breaches
    • Regulatory, contractual or statutory compliance issues
  • Second Class
    • Response time of 6 hours or less
    • Malformed, corrupted or missing data
    • Performance issues
    • High Impact issues
  • Third Class
    • Response time of 48 hours or less
    • Product does not operate as designed but not to a level outlined in the first or second class severities
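
The example classes above map directly to response deadlines. The sketch below shows one way to compute the latest acceptable response time for a report; the enum and function names are hypothetical, and the durations are the example values listed above, which may differ per customer.

```python
from datetime import datetime, timedelta
from enum import Enum


class Severity(Enum):
    """Example severity classes with their maximum response times."""
    FIRST = timedelta(minutes=120)
    SECOND = timedelta(hours=6)
    THIRD = timedelta(hours=48)


def response_deadline(reported_at: datetime, severity: Severity) -> datetime:
    """Return the latest time by which a report must receive a response."""
    return reported_at + severity.value


# Hypothetical outage reported late in the evening.
reported = datetime(2024, 3, 1, 22, 30)
print(response_deadline(reported, Severity.FIRST))   # 2024-03-02 00:30:00
print(response_deadline(reported, Severity.THIRD))   # 2024-03-03 22:30:00
```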

 

Release Management

We update our platforms frequently and updates occur with zero downtime, but we occasionally schedule maintenance periods for major updates. These are timed to minimise impact and happen only when absolutely necessary. You will be notified in advance.