First Trains runs inter-city train services on three main routes across the North of England. In December 2007 its rail network expanded to link Manchester with Glasgow and Edinburgh. Virgin Trains also operates across the UK and is one of the most experienced train companies in Britain, having been in operation for over 12 years.
At the end of October 2012, Virgin decided to sell its train division, and First purchased the company for £460 million in cash and stock. By the end of 2013, First intends to have integrated the two companies' networks and IT, with Virgin's properties rebranded under the First name and logo.
First has purchased Virgin because it feels that the merger is beneficial to its business and will allow it to generate more revenue and profit by gaining the many profitable routes that Virgin previously provided to commuters.
This project acts as a proposal and strategy plan for the merger of the two companies, First and Virgin. Both companies hope to move Virgin's technologies and services over to First's branding, software and hardware as seamlessly as possible. First expects to make savings on IT by retaining only a small percentage of Virgin's IT staff.
Both companies had web solutions which gave commuters access to service news and timetables, and the ability to purchase single, return, and season tickets for their train services at a discounted rate. The technologies underlying these services, however, were different.
First Trains uses an ASP.NET MVC4 solution with C# as its primary programming language and Razor for HTML and document generation. As it is a Microsoft-based application, MSSQL is used for all its relational database needs. The software is hosted on a cluster of Windows Server 2008 machines located in a datacenter in Manchester.
Virgin used a web solution written in PHP, with all data stored in a MySQL database. To save on overall IT costs, the software was hosted on Linux-based servers running the high-performance web server Nginx. The services were hosted in a datacenter located in Glasgow.
First has decided that a new project to consolidate both services would cost too much, might suffer from delays or technical glitches, and might alienate customers who are already used to the existing system. Instead, it has decided to incorporate functions and customer data from Virgin's service into its own. The advantages of and reasons for this approach are outlined in the rest of this document, taking security and future concerns into consideration.
The main priority and task of this project will be to migrate customer data from Virgin's systems and incorporate it into First's software. This may require software additions to First's MVC4 solution, and changes to their data model to allow for the data migration. It is hoped that the merger and software update will be mostly seamless to prevent disruption to customers.
The budget for the technological side of the merger stands at £530,000. Overall it will be cost-effective, as First has its own employees to carry out the migration and maintain the software. Staff coming from Virgin may have to relocate and receive training in new technologies before they can help merge the two systems.
This document details how the project will be conducted and the terms and conditions that should be followed to complete it. As this is a large project involving significant sums of money, the first point to be discussed is the technologies that both companies currently use and how First Trains will merge Virgin Trains' data into First's existing web solution. The two companies use different technologies for their websites and their data, and First understands that Virgin's data must be migrated into First's database in a way that retains data integrity.
The two web solutions differ in how users book tickets and in how information is found on the website, so the merging of these features has to be considered as well. The document will also cover the scalability, reliability, and detailed implementation of the project, which may need to be referred to if a problem occurs. In addition, the document covers how First is expected to complete this project by the projected deadline of December 2013, and the pre-emptive steps that should be taken to ensure the project stays on budget.
Choice of Platform
First Trains uses an ASP.NET MVC4 based solution. The acronym MVC stands for Model View Controller. The MVC framework allows for the separation of concerns and lets a software developer build a web application around three roles: Model, View and Controller. MVC is a widely used design pattern, available in an extensive range of programming languages.
Figure: Model View Controller design pattern (M. Crowe, 2012, p.1)
The "Model" represents the data model of the project, meaning the basic logical structure of the stored data and the related classes. Since the whole point of MVC is a "separation of concerns", the model does not contain any information on how the data should be displayed or manipulated. With the help of the ADO.NET Entity Framework (EF), it is possible to design a website by creating the data model first, using the "Code First" approach. EF also allows object-oriented classes to be mapped to relational databases. EF is currently at version 5 and supports many databases, e.g. SQL Server, Oracle and DB2.
A "View" is a collection of classes that make up the user interface. This covers everything that a user can see and interact with, such as widgets, buttons and text boxes. Views are concerned entirely with the appearance of the application, and thus should not contain any application logic or data retrieval code. Instead, a view displays data from the model and may be called by a controller method.
A "Controller" connects the model and the view, allowing these classes to interact with each other. The controller receives user input and, based on the logic within its classes, decides how to handle it, usually by starting a particular view with the data the user has entered. Controllers lend themselves well to test-driven development and to large teams, as is the case with this project.
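The three roles can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the C# used by First's solution, and it is not taken from that solution: the names Timetable, render_timetable and TimetableController are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Timetable:
    """Model: holds data only, with no display or request-handling logic."""
    route: str
    departures: list = field(default_factory=list)


def render_timetable(timetable):
    """View: turns model data into text for the user; no application logic."""
    lines = [f"Route: {timetable.route}"]
    lines += [f"  departs {time}" for time in timetable.departures]
    return "\n".join(lines)


class TimetableController:
    """Controller: receives user input, looks up the model and starts a view."""

    def __init__(self, timetables):
        self._timetables = {t.route: t for t in timetables}

    def show(self, route):
        timetable = self._timetables.get(route)
        if timetable is None:
            return f"No timetable found for {route}"
        return render_timetable(timetable)


controller = TimetableController(
    [Timetable("Manchester-Glasgow", ["09:15", "11:15"])]
)
print(controller.show("Manchester-Glasgow"))
```

Because the view is the only part that knows about presentation, the output format could be changed without touching the model or the controller.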
The MVC framework and MVC-based applications have a number of advantages over forms-based applications. These advantages include:
A large-scale application is easier to manage when split into model, view and controller.
The MVC framework is ideal for development amongst large teams and gives developers full control over the application.
The framework provides good support for test-driven development.
Forms-based applications have their own advantages as well, which include:
A page controller that adds functionality to every page.
View state, which makes it possible to manage state information in multiple ways and across postbacks.
A better fit for smaller teams, and faster deployment, as less code is required and there is less complexity overall.
Cloud computing refers to applications and services offered over the Internet. It has the advantage of allowing its users, as service developers, to be largely unconcerned with the facilities behind the systems they develop. Scalability, security and reliability are functions of the underlying computing power and configuration, and the design made for a service will typically only account for these factors if a problem is suspected.
Several organisations do have to be concerned about available resources, scalability, security and reliability, because they provide services that Internet users now take for granted. Google, Amazon, Microsoft, IBM and Yahoo all need to maintain a large communications infrastructure to suit their business needs. These companies have to make sure that their infrastructure is reliable, or they will rapidly lose market share. It must also be secure, or their customers will quickly stay away. (Anon., 2009)
Microsoft developed and operates a commercial application platform known as Windows Azure. The platform is used in many ways: for storage, for developing and running web applications, or for running applications that require parallel processing, either for efficiency or for the generation of data. Users of the platform can create virtual machines to develop and test applications locally before deploying them to the cloud. The main advantage of using Windows Azure, or any other cloud service, is the scalability of applications running on the service.
For this project, the primary technology that First has chosen is ASP.NET with the MVC4 framework, using C# as the primary programming language. It was chosen because MVC is well suited to web development, is backed by Microsoft, offers an extensive list of features, and is fitting for a large project. It is also more cost-effective to modify First's existing solution than to implement a new system, especially if that system were on a new platform with which existing employees had no experience.
Razor will continue to be used for HTML and document generation. As this is a Microsoft-based web application, MSSQL has been and will be used for all relational database needs, because it is relatively easy to set up with the project and allows for real-time synchronization, which is required as part of the scalability solution.
Cloud-based solutions, as well as the two existing solutions, were considered as options for the merger. In the end, First found Virgin's solution to be inadequately managed and lacking source control; while it was cost-effective, it was too simple overall in comparison. In addition, First's solution had been implemented only shortly before the merger, so management was reluctant to discard the work and money spent.
As the system is technically already in place, the main task involved in this merger is the modification of the data model to support the transition of user data from Virgin's databases. In addition, important or well-loved features of the old website will be implemented. Lastly, there should be a short tutorial explaining the important differences between the two web solutions, and a mass email sent to users (those who have given permission) informing them of this and of when the merger is complete.
It is important to First not to alienate consumers or make the transition problematic for them, and thus extra care is being taken to prevent disruption. In the rest of this section, some of the more important specifics are expanded on in greater detail.
Response, Load times, and Latency
Fast response times and load times are particularly important for the service. This is highlighted by a finding (R. Meyer, 2012) that internet users are very impatient when waiting for a page to load. The finding shows that if a page does not load within 3 seconds, 57% of consumers will abandon the website, 80% of whom will not return.
With this in mind, it is important that response and load times between the servers and clients are under one second in most circumstances, and under three seconds in cases of extreme or abnormal load. To ensure this, the number of transactions required for an average page load is kept low, and the bandwidth required for each page has an upper limit of 1 MB, with an average of 400 KB. Note that this does not include downloadable items such as large images or PDF documents.
Ofcom research (Ofcom, 2012) found that, as of May 2012, the average UK broadband speed is 9.0 Mbit/s. Theoretically, this means that on the average UK connection a user will be able to download an average page in about 0.36 seconds, while a 1 MB page can be downloaded in about 0.89 seconds. Total load times are based on the following assumptions:
Server response time is consistently low (no more than 0.1 seconds).
Latency is low (no more than 0.1 seconds), with most users being UK based and servers located centrally, for example in Manchester.
Given the above, it should be perfectly possible to stay well under one second on average. Assuming the global average internet speed of 2.6 Mbit/s, a higher latency of 0.3 seconds and a response time of 0.3 seconds, an average page should load within two seconds, well under the limit of three seconds.
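The arithmetic behind these estimates can be reproduced with a short sketch. It assumes decimal units (1 KB = 1000 bytes, 1 Mbit = 1,000,000 bits) and a simple additive model of transfer time plus latency plus server response time; the function name page_load_seconds is hypothetical.

```python
def page_load_seconds(page_kb, speed_mbit_s, latency_s=0.0, response_s=0.0):
    """Estimate load time as transfer time plus latency plus server response."""
    page_megabits = page_kb * 8 / 1000  # decimal units assumed
    return page_megabits / speed_mbit_s + latency_s + response_s


# Transfer time only on the UK average connection (9.0 Mbit/s)
print(round(page_load_seconds(400, 9.0), 2))    # average 400 KB page
print(round(page_load_seconds(1000, 9.0), 2))   # 1 MB page

# Full estimate on the UK average connection with 0.1 s latency and response
print(round(page_load_seconds(400, 9.0, 0.1, 0.1), 2))

# Full estimate on the global average connection (2.6 Mbit/s)
print(round(page_load_seconds(400, 2.6, 0.3, 0.3), 2))
```

Under these assumptions the average page comes to roughly 0.56 seconds in total on the UK average connection and roughly 1.83 seconds on the global average connection. Real load times also depend on factors this model ignores, such as the number of requests per page and connection set-up.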
All servers are kept in the same location, helping to ensure a low latency and response time between servers as well.
Client and Server Duties
With such a high-profile website and a large range of customers, it is important that any work that can be done client side is done client side, to alleviate stress and load on the server. It is also important to restrict the number of transactions between the client and the server. With this in mind, Ajax calls are used sparingly, typically only during the payment process or on the login page, apart from those inserted automatically by the ASP.NET MVC4 framework, which cannot be avoided.
The web solutions of First and Virgin did not vary in any major way; however, each operated slightly differently, and Virgin's web solution included a journey planner application. First's web solution is able to suggest a route or journey as part of the booking process, or users can create their own by consulting the timetables, but neither is really a comprehensive solution.
As a result, a journey planner was often requested by consumers, and it is the feature First found most important to re-implement as part of the merged solution. The development is not expected to be particularly difficult or time-consuming. Code used to generate routes in the booking process will be reused, with walking and travel routes between each station and location generated using the Google Maps API, which also allows maps to be incorporated and displayed.
It is expected that 2-3 maintainers of the pre-existing web solution will be re-allocated for implementing this feature and it is expected the feature will be rolled out prior to the deadline.
The primary task of this merger is to migrate data from Virgin's web solution, where data is stored in a MySQL database, over to First's existing implementation, which uses an MSSQL database. First does not see this as a major obstacle. Former employees of Virgin will oversee and contribute to the migration, as they have significant knowledge of the structure and limitations of their own data model.
A plan to achieve this migration has been drafted by the team. In brief, the existing ASP.NET solution allows for relatively easy changes to the data model, so additional fields or modified relationships needed to support new data are not difficult to implement. This is primarily because, with the MVC framework, the data model is a separate concern from the rest of the solution (such as pages and views). When the data model is changed, the solution can be set up to automatically regenerate the existing database with the new fields or relationships. Since the two solutions were not drastically different from each other, First does not expect many changes to the data model to be needed. The main concern is how Virgin's data model is structured and how it will be mapped to the existing solution.
First expects to address this concern by developing a script which will connect to the MySQL database. The script will import 1000 records at a time, decrypting secure data, truncating or discarding fields which are not required, re-encrypting the secure data to the standards used by First, and inserting the result into an empty MSSQL database with the same data model as the pre-existing system. The new MSSQL database will be tested thoroughly to ensure data integrity, and will be used with a test build of First's web solution on a test server.
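The batch-and-transform flow described above might be structured along the following lines. This is a sketch under stated assumptions, not the team's actual script: it is written in Python, plain lists stand in for the MySQL source and the MSSQL target, the decrypt and encrypt callables are placeholders for the real encryption schemes, and all function and field names are hypothetical.

```python
from itertools import islice

BATCH_SIZE = 1000  # batch size stated in the migration plan


def batches(records, size=BATCH_SIZE):
    """Yield successive lists of at most `size` records from any iterable."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def transform(record, keep_fields, secure_fields, decrypt, encrypt):
    """Discard unneeded fields and re-encrypt secure ones to the target standard."""
    row = {name: record[name] for name in keep_fields if name in record}
    for name in secure_fields:
        if name in row:
            row[name] = encrypt(decrypt(row[name]))
    return row


def migrate(source_rows, insert_batch, keep_fields, secure_fields, decrypt, encrypt):
    """Read the source in batches, transform each row, and pass batches to the target."""
    migrated = 0
    for batch in batches(source_rows):
        converted = [
            transform(r, keep_fields, secure_fields, decrypt, encrypt) for r in batch
        ]
        insert_batch(converted)
        migrated += len(converted)
    return migrated


# Stand-ins for illustration only: string reversal and a prefix play the roles
# of the old and new encryption schemes.
source = [
    {"id": i, "email": f"user{i}@example.com", "card": "4321", "legacy_flag": 1}
    for i in range(2500)
]
target = []
count = migrate(
    source,
    target.extend,
    keep_fields=["id", "email", "card"],
    secure_fields=["card"],
    decrypt=lambda value: value[::-1],
    encrypt=lambda value: "enc:" + value,
)
print(count, target[0])
```

Returning the number of migrated rows gives a simple integrity check: it can be compared with the row count of the source table before the new database is accepted for testing.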
At a later date, once any problems with the script have been resolved, Virgin's web solution will be taken down for maintenance from 12 AM on the chosen day, and a fresh data migration will be run using the script to ensure the information is up to date. Using MSSQL's synchronization features, the resulting database will then be merged into the live database. After some preliminary testing, both First's and Virgin's websites will be made live to the public again. It is hoped that the transition will take no longer than 6 hours, so that there is little disruption and the site goes live before normal working hours.
Most of the possible risks are short-term, particularly on the day of transition. These include:
The data migration could fail, or data integrity could be compromised.
Load caused by visitors may surpass the available power provided by servers.
Customers do not like the new website, or the differences confuse them.
The deadline could be delayed.
Steps have been taken to ensure the above do not happen; however, they are still cause for concern, and contingency plans have been put in place. If the data migration fails, a backup database will be used and Virgin's website will continue to run the old software. In the case of excessive load, backup servers will be available should the estimated load have been misjudged.
Problems in large projects like this one are mostly related to project managers being overwhelmed by a lack of good processes. If project managers do not have sufficient processes for planning, estimating, scheduling, scope change, risk management and so on, the project could end in failure. It could also cost the company more than the estimated budget. First should also research and gather information about the hardware and software involved before the merger begins.
Reliability is of extreme importance for any application or website, especially those which are involved in e-commerce or on which users are dependent. This holds true for this project: lapses in reliability can cause productivity to decline dramatically and reputations to suffer. The better the reliability of the service, the more productive and cost-effective it can be for the developers and the company. As a result, for most websites, including this one, reliability is of the utmost importance.
To support reliability, the site is hosted on a number of servers, with the database servers synchronised with each other in real time. At the front of this infrastructure, three load balancers distribute requests across the available servers to spread the load. More than one load balancer is required because a single one would itself be a single point of failure.
(W. Yoo, 2008)
When a server becomes unavailable, the load balancer will automatically recognise this, stop submitting requests to that server, and contact an administrator. It will also alert an administrator if the load across the range of servers reaches a particular percentage.
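The failover behaviour can be sketched as a round-robin balancer that skips servers marked as unavailable and raises an alert when one goes down. This Python sketch is illustrative only: the document does not state which balancing algorithm is used, so round-robin is an assumption, and the LoadBalancer class, its methods and the server names are hypothetical.

```python
class LoadBalancer:
    """Round-robin balancer that skips unavailable servers (assumed algorithm)."""

    def __init__(self, servers, notify):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._notify = notify  # callable used to alert an administrator
        self._next = 0

    def mark_down(self, server):
        """Stop routing to a server and alert an administrator once."""
        if server in self._healthy:
            self._healthy.discard(server)
            self._notify(f"{server} is unavailable")

    def mark_up(self, server):
        """Resume routing to a known server."""
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        """Return the next healthy server in rotation."""
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


alerts = []
balancer = LoadBalancer(["web1", "web2", "web3"], alerts.append)
balancer.mark_down("web2")
picks = [balancer.pick() for _ in range(4)]
print(picks, alerts)
```

In this run, requests alternate between web1 and web3 while web2 is down, and a single alert is recorded.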
A goal of this project is to upgrade the system so that it can support more concurrent users once Virgin Trains has been merged into First. The project will need to meet current and future demand from a larger customer base and provide customers with better train facilities. The first thing to consider is the data that will be transferred from Virgin Trains to First Trains.
With a booking system comes the requirement for a payment system, including the storage and transfer of payment details. Customers' personal and payment details are extremely sensitive data, and thus extra steps need to be taken to protect them. If they were ever compromised, it would greatly damage the company's reputation.
Security has already been addressed in First's existing system using various features provided by ASP.NET and the MVC4 framework, as well as physical and in-code precautions. In the proposed system little would change, as the current precautions are seen to be sufficient.
Precautions include encrypting all sensitive payment data with salted AES encryption and storing it in a database on a separate, bare-bones secure server. The data is stored on a separate server in the hope that those details will remain safe in the event that the primary servers hosting the web solution are compromised. CVV2 codes are not stored.
Small precautions are also taken in the web forms for payments. For example, payment information is cleared from the form when navigating back. Full credit card numbers and CVV2 codes are never displayed to the user by the webpage.
Features of ASP.NET are also being taken advantage of, for example the attributes [Authorize] and [AllowAnonymous]. [Authorize] is an easy way to specify in code that a user must log in before being able to access or view a page. [AllowAnonymous] is, likewise, an easy way to allow anonymous users to view a page. Both attributes can be applied to an individual action method or to an entire controller.
Most of the website is accessible without HTTPS/SSL. It is enforced on login, payment and booking pages by applying the [RequireHttps] attribute within the controller. Users can opt to use HTTPS/SSL elsewhere if they require or want it, but it is not the default unless they have logged in. Internal testing found little downside to securing the entire website, with performance being the only real concern; equally, there was little reason to do so except where sensitive data is being transmitted. Performance is unlikely to take a significant dip in any case, as the main cost is only the initial handshake.
Claims-based authentication is also used. Tokens are exchanged which contain the username and group of a user. For example, the user "administrator" has access to a special admin panel, while users belonging to the group "maintainer" can amend timetables and data on the website. Every other user, including registered users, is treated as "user". This is not currently used to its full potential, but it means that more groups with different access rights can be added at a later date if required.
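The group-based access check described above can be sketched as follows. This is a Python illustration rather than First's ASP.NET code, and it leaves out token signing and validation entirely. The permission names and the exact mapping of groups to permissions are assumptions; only the three groups and the fallback to "user" come from the description.

```python
# Assumed mapping of groups to permissions; the permission names are hypothetical.
GROUP_PERMISSIONS = {
    "administrator": {"view_site", "admin_panel"},
    "maintainer": {"view_site", "edit_timetables"},
    "user": {"view_site"},
}


def make_token(username, group="user"):
    """Build a token carrying the two claims described: username and group."""
    return {"username": username, "group": group}


def is_allowed(token, permission):
    """Check a permission against the token's group; unknown groups count as "user"."""
    group = token.get("group", "user")
    allowed = GROUP_PERMISSIONS.get(group, GROUP_PERMISSIONS["user"])
    return permission in allowed


print(is_allowed(make_token("administrator", "administrator"), "admin_panel"))
print(is_allowed(make_token("jsmith", "maintainer"), "admin_panel"))
```

Adding a new group with different access rights then only requires a new entry in the mapping, which matches the extensibility noted above.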
Scalability is the ability of a system to accommodate and handle a significant or growing amount of work. This section describes how the project handles spikes in user traffic and growth in usage over time. A system that scales well should be able to maintain the same or similar performance as usage grows or spikes. Many applications are unable to do this well, and as usage grows, failures and errors become increasingly common.
The solution provided for reliability also allows for a high degree of scalability. With load balancers at the front of the infrastructure, it is possible to handle a significant number of concurrent connections and spread these users across a large number of IIS-based web servers. If load begins to exceed expectations, or begins to detrimentally affect the service, more content servers can be added to decrease the overall load. Clone images of each revision of the content servers are kept so that, should this happen or hardware fail, another server can be deployed in a short amount of time.
First has faith that, with the backing of Microsoft, technologies such as IIS, ASP.NET and MSSQL all offer competitive, scalable performance. Internal and real-world testing (as the implementation was already in place) confirms this belief. To ensure that the new system can handle the increased load, a test network was set up and an artificial denial-of-service attack was run against it. The attack simulated the maximum number of users expected to use the website at any given time, as well as high spikes in usage. The test system held up well against the attack and continued to respond within the three-second load time goal.
With this knowledge First are confident the real implementation will be scalable enough on the existing hardware.
In choosing and implementing this system, a number of considerations were made. For example, the MVC framework was chosen in part because it adapts readily to change: if new guidelines or website designs become common, the whole solution does not need to be re-implemented, only the "views" need to be modified. Similarly, if the structure or flow of the website needs to change, only the controllers (and possibly, by extension, the views) need to be modified.
Even when a new version of the ASP.NET MVC framework is released, it is relatively easy to port a project over to it, and to date Microsoft has released whitepapers detailing how to do this with each major release.
In regards to hardware, the current status of the existing servers will be revised on a bi-annual basis. They will be evaluated to find if it is cost-effective to upgrade or replace existing servers, either because technological advancements allow the company to do "more with less", or because increase in usage is leading to a high load that may detrimentally affect the user experience. It is possible for the IT dept to stage "emergency evaluations" if for some reason it is deemed critical to hold the evaluation sooner, or because of performance issues caused by a sharp rise in usage.
First feels that the decisions made in this merger have satisfied all the requirements and budget constraints, and expects the merger to conclude smoothly. The web solution has been cost-effective, as an existing solution has been used and new features have been developed in-house by existing employees.
The major expense resulting from the merger has been the employment of former Virgin IT staff, of whom 30% were recruited. Where possible, employees with knowledge of ASP.NET and C# were given a higher priority in recruitment, and training has been provided for those without it. Extra load is expected with a significant rise in usage, but savings have been made here by repurposing servers that Virgin owned.
In conclusion, this document represents a feasible and effective solution.