ZYNSTRA SERVER & INTELLIGENT CONTROL PLANE AVAILABILITY
What is the minimum level of availability that can be expected of a Zynstra server and the Intelligent Control Plane?
How is the severity and impact (and therefore priority) of an issue represented?
Before explaining what is meant by ‘availability’, it is important to understand how the priority of an issue is defined based on its severity and impact, because the definition of availability relates only to certain priorities of issue.
When a Customer or Service Provider raises an issue with the Support Team, a priority is assigned to it based on its likely impact on the customer. The definitions used when determining priority are as follows:
- Priority 1 (P1) All of the customer’s users are unable to access (a) one or more critical software components on a server or (b) one or more Custom VMs on that server.
Priority 1 issues exclude situations in which a disaster (as defined later in this document) has occurred, in which case a separate set of service level commitments applies.
- Priority 2 (P2) Any of the customer’s users are unable to access a critical software component or one or more Custom VMs. In practice, P2 issues should occur infrequently, as issues whose underlying cause is a failure of some kind within the server’s software or hardware generally affect all users, not just one or some users.
Priority 2 issues exclude any issues which Customers/Service Providers should be able to resolve themselves using the various consoles provided with each server, including but not limited to user account management requests and issues such as password resets.
- Priority 3 (P3) Any users are experiencing an issue with one or more critical software components, but the issue does not materially impact their ability to work. The issue is an inconvenience (and possibly a major inconvenience) but the users can work around the issue for a short period.
- Priority 4 (P4) A request for a configuration change to the server’s software platform that only we can make (because of its complexity or side-effects) or a request for advice/support on usage best practice.
The ‘critical software components’ on a server are those which immediately affect any user’s ability to access the services and applications running on the server whose management is our responsibility. They include the Active Directory Domain Controller, Network Gateway, Security Gateway and Fileserver, but they exclude, for example, the Local Backup Software and Cloud Backup Software, neither of which immediately prevents any user from working effectively. The critical software components also include any Custom Virtual Machines on the server, but they exclude any applications or other services installed and maintained by the Service Provider or Customer in those Custom Virtual Machines; the responsibility for these lies with the Service Provider or Customer.
The ‘critical components’ of the management platform are the Commissioning Console, Monitoring Console and Central Database.
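As an illustration only, the priority definitions above can be sketched as a small decision function. The field names (`kind`, `users_affected`, `disaster`, `self_service`) and the choice to return `None` for excluded cases are assumptions made for this sketch; they are not part of the Support Team's actual tooling.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Priority(IntEnum):
    P1 = 1
    P2 = 2
    P3 = 3
    P4 = 4


@dataclass
class Issue:
    # Hypothetical fields that paraphrase the definitions above.
    kind: str                     # "access_loss", "inconvenience" or "request"
    users_affected: str = "some"  # "all" or "some"; only used for access_loss
    disaster: bool = False        # disasters fall under separate commitments
    self_service: bool = False    # resolvable via the consoles provided


def classify(issue: Issue) -> Optional[Priority]:
    """Return the priority, or None if the issue is excluded from P1-P4."""
    if issue.kind == "access_loss":
        # Loss of access to a critical software component or Custom VM.
        if issue.disaster:
            return None  # assumption: handled under disaster recovery instead
        if issue.users_affected == "all":
            return Priority.P1
        if issue.self_service:
            return None  # excluded from P2, e.g. a password reset
        return Priority.P2
    if issue.kind == "inconvenience":
        return Priority.P3  # users can work around it for a short period
    if issue.kind == "request":
        return Priority.P4  # configuration change or best-practice advice
    raise ValueError(f"unknown issue kind: {issue.kind!r}")
```

For example, `classify(Issue(kind="access_loss", users_affected="all"))` yields `Priority.P1`, while the same loss of access caused by a disaster is not classified here at all.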
What do we mean by ‘Availability’ and ‘Service Loss’?
A server is deemed to be ‘available’ if it has not experienced a Priority 1 issue that has not yet been resolved. A Service Loss corresponds to any period of time at any time of the day or night on weekdays or at weekends (that is, 24 x 7) during which a server is not ‘available’ except for periods of time resulting from the following:
- Scheduled patches and upgrades that are periodically applied to all servers.
- Emergency security patches that very occasionally need to be distributed to servers at short notice.
- Factors beyond our reasonable control including issues caused by (a) the Customer, (b) other technology in the Customer’s infrastructure that interacts with the server, (c) third parties not contracted to us such as utility and dependent service providers that fail to provide continuous service (e.g. power, connectivity) or (d) natural disasters and force majeure.
How is the duration of a ‘Service Loss’ measured?
A Service Loss is deemed to have commenced at the earlier of (a) the Customer or Service Provider reporting the issue to our Support Team and (b) the time at which the Service Loss was detected by our automated monitoring capabilities. A Service Loss ends when all the critical software components have been restored to their correct working state and the Customer or Service Provider has been notified of this restoration of service.
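The measurement rule above amounts to simple arithmetic on four timestamps. The following is a minimal sketch, assuming all timestamps share one timezone; the function and parameter names are hypothetical, and the excluded periods (scheduled patches, emergency security patches, factors beyond our control) are not modelled.

```python
from datetime import datetime, timedelta
from typing import Optional


def service_loss_duration(
    reported_at: Optional[datetime],
    detected_at: Optional[datetime],
    restored_at: datetime,
    notified_at: datetime,
) -> timedelta:
    """Duration of a Service Loss under the rule described above."""
    starts = [t for t in (reported_at, detected_at) if t is not None]
    if not starts:
        raise ValueError("need a report time or a detection time")
    # Starts at the earlier of the report and the automated detection.
    start = min(starts)
    # Ends only once service is restored AND the notification has been sent.
    end = max(restored_at, notified_at)
    return end - start


# Detected at 09:05, reported at 09:15, restored at 10:30, notified at 10:40.
loss = service_loss_duration(
    reported_at=datetime(2024, 3, 4, 9, 15),
    detected_at=datetime(2024, 3, 4, 9, 5),
    restored_at=datetime(2024, 3, 4, 10, 30),
    notified_at=datetime(2024, 3, 4, 10, 40),
)
print(loss)  # 1:35:00
```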
SUPPORT SERVICE
What is the Support Service?
The Support Service is a collection of service-based commitments that we make in relation to a server. It defines the timescales in which we respond to incidents and the speed with which those services can be expected to be delivered.
What are the service commitments associated with the Support Service?
Zynstra offers two types of support: Standard Support and a Gold Managed Service. While the support, monitoring and management services offered vary between these two options, the commitments to support hours and issue response times are identical.
- Hardware component failure response and resolution time depends on the hardware support service purchased with the server.
- As they are service-affecting, P1 and P2 issues must always be reported by phone even if reported by email or through the Support Portal as well. This is to ensure prompt action in the event of any IT or communications issues that could delay the receipt of email or Support Portal requests.
- P3 and P4 issues can be reported to the Support Team 24 x 7 by email, through the Support Portal or by phone. The Support Team will start to look into them the following working day and the resolution time targets mentioned above commence at 8am on the following day.
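To illustrate when the resolution time targets start for a P3 or P4 issue, here is a minimal sketch. It assumes that ‘the following day’ means the following working day, that working days are Monday to Friday with no public holidays, and that times are local; none of these assumptions is stated in this document, and the function name is hypothetical.

```python
from datetime import datetime, time, timedelta


def target_clock_start(reported_at: datetime) -> datetime:
    """8am on the next working day after a P3/P4 issue is reported."""
    day = reported_at.date() + timedelta(days=1)
    # Assumption: skip Saturday (5) and Sunday (6); holidays are ignored.
    while day.weekday() >= 5:
        day += timedelta(days=1)
    return datetime.combine(day, time(8, 0))


# Reported on Friday 1 March 2024 at 16:00.
print(target_clock_start(datetime(2024, 3, 1, 16, 0)))  # 2024-03-04 08:00:00
```

Under these assumptions, an issue reported on a Friday afternoon starts its target clock at 8am on the following Monday.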
Hardware Component Failure Resolution Time
How quickly can Service Providers expect hardware failures to be resolved?
Please note: The Support Team will only deal with hardware faults if the hardware was provided by Zynstra.
Hardware fault resolution is the process of restoring a server to its normal operating state following a hardware component failure. The service commitments associated with the process of hardware fault resolution apply only if the hardware has been supplied as an integral part of the subscription; that is, the hardware was not purchased separately.
Hardware component failures are resolved using the HPE Care Pack that is associated with the Support Service. The terms and conditions associated with hardware failure resolution are those published by HPE for its Care Packs.
The hardware component failure resolution time is measured from the earlier of (a) the time that the Customer or Service Provider reported the fault to us, and (b) the time that we detected the fault using our automated monitoring tools. The fault is deemed to have been resolved when the server becomes ‘available’ for use again.
DISASTER RECOVERY
Disaster recovery (or DR) is the process of restoring a server to its normal operating state after a disaster, with the software and data on the server restored to their state at the time that the most recent Cloud Backup of that software and data commenced.
Please note that the Disaster Recovery service is an optional product and is not available to all customers.
1. Achievement of the target recovery point service commitment will depend on the rate of change of data during the days immediately preceding the disaster event and on the available upload bandwidth to the Internet.
2. Achievement of the Cloud Recovery Time target will depend on the volume of customer data (files, databases, etc.) on the server at the time of the last Cloud Backup that was completed prior to the disaster event and on the number of distinct Custom Virtual Machines that are present on the server at that time.
Data Recovery Point and Recovery Time
What do we mean by a ‘disaster’?
A disaster is an event which destroys or damages beyond repair a server or renders it inaccessible indefinitely. Examples of disasters include theft, flood or fire.
What do we mean by ‘disaster recovery’?
Disaster recovery (or DR) is the process of restoring a server to its normal operating state after a disaster, with the software and data on the server restored to their state at the time that the most recent Cloud Backup of that software and data commenced.
What are the key steps in the ‘disaster recovery’ process?
The first step in the recovery process is the rapid restoration of the Customer’s software and data into the Cloud so that it is accessible to the Customer there. We perform this restoration if requested by the Service Provider and endeavour to complete it within the specified Cloud Recovery Time for the Support Service.
The second step in the recovery process is the full restoration of the Customer’s software and data on a new server on the Customer’s premises (which may differ from the original premises if the premises were also damaged during the disaster event).
- If the server hardware was supplied by us, we will restore a server of identical specification on the Customer’s chosen premises, pre-loaded with all the Customer’s software and data that were on the server at the time the most recently completed Cloud Backup commenced. We will endeavour to complete this within the On-Premises Recovery Time specified for the applicable Support Service.
- If the server hardware was not supplied by us as an integral part of the subscription but was purchased by the Service Provider or Customer, the hardware owner is responsible for the supply of identical new hardware following the disaster from the same party from whom the original hardware was purchased. We will restore the Customer’s software and data to the new hardware based on the state of the software and data at the time that the most recently completed Cloud Backup commenced. No commitment to the On-Premises Recovery Time is offered in this case, as the speed of restoration will depend on the speed of sourcing of the new server hardware and on the Customer’s available download bandwidth. By contrast, if we have supplied the hardware, we can perform the restoration remotely (with much higher available internet download bandwidth), which leads to a much faster restoration time.