Microsoft describes Windows Azure as an “operating system for the cloud.” But what exactly is the “cloud”?
At its core, cloud computing is the realization of the long-held dream of utility computing. The “cloud” is a metaphor for the Internet, derived from a common representation in computer network drawings showing the Internet as a cloud. Utility computing is a concept that entails having access to computing resources, and paying for the use of those resources on a metered basis, similar to paying for common utilities such as water, electricity, and telephone service.
History of Cloud Computing
The history of cloud computing includes utilization of the concept in a variety of environments, including the following:
- Time-sharing systems
- Mainframe computing systems
- Transactional computing systems
- Grid computing systems
Time-sharing systems:
Cloud computing has its origins in the 1960s. Time-sharing systems were the first to offer a shared resource to the programmer. Before time-sharing systems, programmers typed in code using punch cards or tape, and submitted the cards or tape to a machine that executed jobs synchronously, one after another. This was massively inefficient, since the computer was subjected to a lot of idle time.
Bob Bemer, an IBM computer scientist, proposed the idea of time sharing as part of an article in Automatic Control Magazine. Time sharing took advantage of the time the processor spent waiting for I/O, and allocated these slices of time to other users. Since multiple users were dealt with at the same time, these systems were required to maintain the state of each user and each program, and to switch between them quickly. Though today’s machines accomplish this effortlessly, it took some time before computers had the speed and size in core memory to support this new approach.
Mainframe computing:
Though nearly outdated today, mainframe computing innovated several of the ideas you see in cloud computing. These large, monolithic systems were characterized by high computation speed, redundancy built into their internal systems, and generally delivering high reliability and availability. Mainframe systems were also early innovators of a technology that has resurged over the past few years: virtualization.
Mainframe computing and cloud computing share the idea of a centralized resource (in the case of cloud computing, a data center) that is too expensive for most companies to buy and maintain, but affordable to lease or rent. Data centers represent investments that only a few companies can make, and smaller companies rent resources from the companies that can afford them.
Transactional computing:
Transactional systems are the underpinning of most modern services. The technology behind transactional systems is instrumental in modern cloud services. Transactional systems allow processing to be split into individual, indivisible operations called transactions. Each transaction is atomic—it either succeeds as a whole or fails as a whole. Transactions are a fundamental part of every modern database system.
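To make atomicity concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is chosen purely for illustration, and the table, account names, and the "no negative balance" rule are hypothetical; none of this is taken from IMS, System R, or Windows Azure. A transfer that fails midway leaves both balances untouched.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
            # Hypothetical business rule: balances may never go negative.
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds as a whole
failed = transfer(conn, "alice", "bob", 500)  # fails as a whole, nothing changes
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

The second call debits the source account and then raises, but the rollback undoes the debit, so the only visible change is the first, fully completed transfer.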
While database systems were emerging, several significant innovations were happening
in the transaction processing space. One of the first few systems with transaction processing
capabilities was IBM’s Information Management System (IMS).
IMS was a joint hierarchical database and information management system with transaction
processing capabilities. It had several of the features now taken for granted in modern systems: Atomicity, Consistency, Isolation, Durability (ACID) support; device independence; and so on. Somewhat surprisingly, IMS has stayed strong over the ages, and is still in widespread use.
IBM also contributed another important project to transaction processing: System R. System R was the first SQL implementation that provided good transaction processing performance. System R performed breakthrough work in several important areas: query optimization, locking systems, transaction isolation, storing the system catalog in a relational form inside the database itself, and so on.
Grid computing:
The term grid computing originated in the 1990s, and referred to making computers accessible in a manner similar to a power grid. This sounds a lot like cloud computing, and reflects the overlap between the two, with some companies even using the terms interchangeably.
The cloud allows you to run workloads similar to a grid. When you have data that must be processed, you spin up the required number of machines, split the data across the machines in any number of ways, and aggregate the results.
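The split, process, and aggregate pattern can be sketched in a few lines of Python. This is only an illustration: worker threads stand in for the machines you would spin up, and the function names and the sum-of-squares workload are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def split(data, parts):
    """Split `data` into `parts` roughly equal contiguous chunks."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks

def process(chunk):
    """Work done independently by one machine: here, a sum of squares."""
    return sum(x * x for x in chunk)

def run_on_grid(data, machines):
    # Each worker thread stands in for one machine spun up for the job.
    with ThreadPoolExecutor(max_workers=machines) as pool:
        partials = list(pool.map(process, split(data, machines)))
    return sum(partials)  # aggregate the partial results

total = run_on_grid(list(range(10)), machines=3)
```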
In a basic grid computing system, every computer can access the resources of every other computer belonging to the network.
What sets grid computing apart from traditional distributed computing is its focus on supporting computation across multiple administrative domains. Grids offer a way to use information technology resources optimally inside an organization by virtualizing computing resources. Because a grid supports multiple administrative policies, as well as security authentication and authorization mechanisms, it can be distributed over a local, metropolitan, or wide-area network.
Distributed computing:
Distributed computing is an environment in which a group of independent and geographically dispersed computer systems work together to solve a complex problem. Each system solves part of the problem, and the results from all the computers are then combined. These loosely coupled systems work in coordination toward a common goal. Distributed computing can be defined as:
- A computing system in which services are provided by a pool of computers collaborating over a network.
- A computing environment that may involve computers of differing architectures and data representation formats that share data and system resources.
Understanding Cloud Services
Infrastructure-as-a-Service (IaaS):
This refers to services that provide lower levels of the stack. They typically provide basic hardware as a service—things such as virtual machines, load-balancer settings, and network attached storage. Amazon Web Services (AWS) and GoGrid fall into this category.
Platform-as-a-service (PaaS):
Providers such as Windows Azure and Google App Engine (GAE) provide a platform that users write to. In this case, the term platform refers to something that abstracts away the lower levels of the stack. The user's application runs in a specialized environment. This environment is sometimes restricted: the application may run as a low-privilege process, with restrictions on writing to the local disk, and so on. Platform providers also provide abstractions around services (such as email, distributed caches, structured storage), and provide bindings for various languages. In the case of GAE, users write code in a subset of Python, which executes inside a custom hosting environment in Google’s data centers.
Software-as-a-Service (SaaS):
The canonical example of this model is Salesforce.com. Here, specific provided applications can be accessed from anywhere. Instead of hosting applications such as Customer Relationship Management (CRM), Enterprise Resource Planning (ERP), and Human Resources (HR) on-site, companies can outsource these applications.
The Windows Azure Platform
The Windows Azure Platform is a group of cloud technologies to be used by applications running in Microsoft’s data centers, on-premises and on various devices.
Azure AppFabric:
Azure AppFabric services provide typical infrastructure services required by both on-premises
and cloud applications. These services act at a higher level of the “stack” than Windows Azure (which you’ll learn about shortly). Most of these services can be accessed through a public HTTP REST API, and hence can be used by applications running on Windows Azure, as well as your applications running outside Microsoft’s data centers.
Following are the components of the Windows Azure AppFabric platform:
Service Bus:
Hooking up services that live in different networks is tricky. There are several issues to work through: firewalls, network hardware, and so on. The Service Bus component of Windows Azure AppFabric is meant to deal with this problem. It allows applications to expose Windows Communication Foundation (WCF) endpoints that can be accessed from “outside” (that is, from another application not running inside the same location). Applications can expose service endpoints as public HTTP URLs that can be accessed from anywhere.
Access Control:
This service lets you use federated authentication for your service based on a claims-based, RESTful model. It also integrates with Active Directory Federation Services, letting you integrate with enterprise/on-premises applications.
SQL Azure:
SQL Azure is SQL Server hosted in the cloud. It provides relational database features, but does it on a platform that is scalable, highly available, and load-balanced. Most importantly, unlike SQL Server, it is provided on a pay-as-you-go model, so there are no capital fees upfront (such as for hardware and licensing).
As you’ll see shortly, there are several similarities between SQL Azure and the table services provided by Windows Azure. They both are scalable, reliable services hosted in Microsoft data centers. They both support a pay-for-usage model. The fundamental differences come down to what each system was designed to do.
Windows Azure:
Windows Azure is Microsoft’s platform for running applications in the cloud. You get on-demand computing and storage to host, scale, and manage web applications through Microsoft data centers.
Virtualization:
At the bottom of the Windows Azure stack, you’ll find a lot of machines in Microsoft data centers. These are state-of-the-art data centers with efficient power usage, beefy bandwidth, and cooling systems. Even the most efficient facilities still leave a lot of room for overhead and waste when it comes to utilization of resources. Since the biggest source of data center cost is power, efficiency is typically measured in performance/watt/dollar.
Windows Azure has its own hypervisor built from scratch and optimized for cloud services. In practice this means that, since Microsoft controls the specific hardware in its data centers, this hypervisor can make use of specific hardware enhancements that a generic hypervisor targeted at a wide range of hardware (and a heterogeneous environment) cannot. This hypervisor is efficient, has a small footprint, and is tightly integrated with the kernel of the operating system running on top of it.
The Fabric Controller:
Imagine that you’re describing your service architecture to a colleague. You probably walk up to the whiteboard and draw some boxes to refer to your individual machines, and sketch in some arrows. In the real world, you spend a lot of time implementing this diagram. You first set up some machines on which you install your bits. You deploy the various pieces of software to the right machines. You set up various networking settings: firewalls, virtual LANs, load balancers, and so on. You also set up monitoring systems to be alerted when a node goes down.
In short, the fabric controller performs the following key tasks:
Hardware management:
The fabric controller manages the low-level hardware in the data center. It provisions and monitors the hardware, and takes corrective actions when things go wrong. The hardware it manages ranges from nodes to TOR/L2 switches, load balancers, routers, and other network elements. When the fabric controller detects a problem, it tries to perform corrective actions. If that isn’t possible, it takes the hardware out of the pool and gets a human operator to investigate it.
Service modeling:
The fabric controller takes declarative service specifications (the written-down, logical version of the whiteboard diagrams mentioned at the beginning of this section) and maps them to physical hardware. This is the key task performed by the fabric controller. If you grok this, you grok the fabric controller. The service model outlines the topology of the service, and specifies the various roles and how they’re connected, right down to the last precise granular detail. The fabric controller can then maintain this model. For example, if you specify that you have three frontend nodes talking to a backend node through a specific port, the fabric controller can ensure that the topology always holds up. In case of a failure, it deploys the right binaries on a new node, and brings the service model back to its correct state.
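The idea of maintaining a declared topology can be illustrated with a small reconciliation sketch. This is not the fabric controller's actual algorithm or data format: the dictionary-based model, the role names, and the action tuples are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical declarative model: role name -> required instance count.
SERVICE_MODEL = {"frontend": 3, "backend": 1}

def reconcile(model, running):
    """Return the actions needed to make `running` match `model`.

    `running` is a list of (role, node) pairs for healthy instances.
    """
    actual = Counter(role for role, _node in running)
    actions = []
    for role, wanted in sorted(model.items()):
        missing = wanted - actual[role]
        if missing > 0:
            actions.extend(("deploy", role) for _ in range(missing))
        elif missing < 0:
            actions.extend(("stop", role) for _ in range(-missing))
    return actions

# One frontend node has failed, so only two of three are still running.
running = [("frontend", "node1"), ("frontend", "node2"), ("backend", "node7")]
actions = reconcile(SERVICE_MODEL, running)
```

Because the model states what should exist rather than how to get there, the same comparison works both for the first deployment and for recovery after a failure.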
Operating system management:
The fabric controller takes care of patching the operating systems that run on these nodes, and does so in a manner that lets your service stay up.
Service life cycle:
The fabric controller also automates various parts of the service life cycle, such as updates and configuration changes. You can partition your application into sections (update domains and fault domains), and the fabric controller updates only one domain at a time. If you’re pushing new configuration changes, it brings down one section of your machines and updates them, then moves on to the next set, and so on, ensuring that your service stays up throughout.
Storage
If you think of Windows Azure as being similar to an operating system, the storage services are analogous to its filesystem. Normal storage solutions don’t always work very well in a highly scalable, scale-out (rather than scale-up) cloud environment. This is what pushed Google to develop systems such as BigTable and Google File System, and Amazon to develop Dynamo and to later offer S3 and SimpleDb.
Windows Azure offers three key data services: blobs, tables, and queues. All of these services are highly scalable, distributed, and reliable. All of the services detailed here are available over HTTP through a simple REST API, and can be accessed from outside Microsoft’s data centers as well. Like everything else in Azure, you pay only for what you use and what you store.
Blob storage:
The blob storage service provides a simple interface for storing named files along with metadata. Files can be up to 1 TB in size, and there is almost no limit to the number you can store or the total storage available to you. You can also chop uploads into smaller sections, which makes uploading large files much easier.
Here is some sample Python code to give you a taste of how you’d access a blob using the API. This uses the unofficial library from http://github.com/sriramk/winazurestorage/.
# HOST, ACCOUNT, and SECRET_KEY are placeholders for your storage host, account name, and key.
blobs = BlobStorage(HOST, ACCOUNT, SECRET_KEY)
blobs.create_container("testcontainer", False)
blobs.put_blob("testcontainer", "test", "Hello World!")
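The idea of chopping an upload into smaller sections can be sketched without touching any storage API. The helper below is hypothetical: the function name, the block size, and the block ID scheme are assumptions for illustration, and no call from the library above is used.

```python
import base64

def split_into_blocks(data, block_size):
    """Yield (block_id, chunk) pairs covering `data` in order."""
    for index, offset in enumerate(range(0, len(data), block_size)):
        # Hypothetical ID scheme: zero-padded index, base64-encoded.
        block_id = base64.b64encode(b"%08d" % index).decode("ascii")
        yield block_id, data[offset:offset + block_size]

payload = b"Hello World!" * 1000            # 12,000 bytes
blocks = list(split_into_blocks(payload, 4096))
reassembled = b"".join(chunk for _id, chunk in blocks)
```

Each block could then be uploaded (and retried) independently, with the ordered list of block IDs telling the service how to stitch the file back together.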
Queue service:
The queue service provides reliable storage and delivery of messages for your application. You’ll typically use it to hook up the various components of your application, so that you don’t have to build your own messaging system. You can send an unlimited number of messages, and you are guaranteed reliable delivery. You can also control the lifetime of the message. You can decide exactly when you’re finished processing the message and remove it from the queue. Since this service is available over the public HTTP API, you can use it for applications running on your own premises as well.
Table storage:
The table storage service is arguably the most interesting of all the storage services. Almost every application needs some form of structured storage. Traditionally, this is through a relational database management system (RDBMS) such as Oracle, SQL Server, MySQL, and the like. The Windows Azure table storage service provides the same kind of capability. You can create massively scalable tables (billions of rows, and it scales along with traffic). The data in these tables is replicated to ensure that no data is lost in the case of hardware failure. Data is stored in the form of entities, each of which has a set of properties. This is similar to (but not the same as) a database table and column. You control how the data is partitioned using PartitionKeys and RowKeys. By partitioning across as many machines as possible, you help query performance.
You may be wondering what language you use to query this service. If you’re in the .NET world, you can write Language Integrated Query (LINQ) code, and your code will look similar to LINQ queries you’d write against other data stores. If you’re coding in Python or Ruby or some other non-.NET environment, you have an HTTP API where you can encode simple queries. If you’re familiar with ADO.NET Data Services (previously called Astoria), you’ll be happy to hear that this is just a normal ADO.NET Data Service API.
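To show why partitioning helps, here is a toy in-memory model of entities keyed by PartitionKey and RowKey. The class, the hash-based placement, and the sample data are assumptions made for illustration; this is not how Windows Azure actually assigns partitions to machines, and it is not the table service API.

```python
import hashlib
from collections import defaultdict

def server_for(partition_key, server_count):
    """Map a PartitionKey to a server index with a stable hash (an assumption)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % server_count

class TableSketch:
    def __init__(self, server_count):
        self.server_count = server_count
        # servers[i] maps (PartitionKey, RowKey) -> entity properties
        self.servers = defaultdict(dict)

    def insert(self, partition_key, row_key, **properties):
        server = server_for(partition_key, self.server_count)
        self.servers[server][(partition_key, row_key)] = properties

    def query_partition(self, partition_key):
        """A single-partition query touches exactly one server."""
        server = server_for(partition_key, self.server_count)
        return sorted(
            (rk, props) for (pk, rk), props in self.servers[server].items()
            if pk == partition_key)

table = TableSketch(server_count=4)
table.insert("customer-42", "order-002", total=15)
table.insert("customer-42", "order-001", total=99)
table.insert("customer-7", "order-001", total=5)
orders = table.query_partition("customer-42")
```

All entities that share a PartitionKey land together, so a query scoped to one partition is served by one machine, while different partitions can be served in parallel by different machines.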
The Data Centers
A data center or computer centre (also datacenter) is a facility used to house computer systems and associated components, such as telecommunications and storage systems. It generally includes redundant or backup power supplies, redundant data communications connections, environmental controls (e.g., air conditioning, fire suppression) and security devices.
Data centers are where all the action is as far as Windows Azure is concerned. Windows Azure physically runs in several of Microsoft’s data centers around the world. Like all other major companies building and running data centers, Microsoft likes to keep certain information about them close to the vest, so this section may seem light on details in some areas.
The first data centers were similar to the ones you’ve probably seen in many offices. They had cables running under raised floors, some enhanced security, and environmental controls, but they weren’t meant to run services at the massive scale required today.
The Hypervisor
To create several virtual machines on one physical machine, a thin piece of low-level system software called the hypervisor or Virtual Machine Monitor (VMM) is used. The hypervisor is responsible for fair allocation of resources between the various virtual machines. It schedules CPU time and I/O requests, and provides for isolation between the various virtual machines.
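Fair allocation of CPU time can be pictured with a simple round-robin sketch. Real hypervisors use far more sophisticated schedulers; the function name, the time-slice size, and the virtual machine names here are purely illustrative.

```python
from collections import deque

def schedule(vms, time_slice, total_time):
    """Give each VM `time_slice` units of CPU in turn until `total_time` is used."""
    queue = deque(vms)
    usage = {vm: 0 for vm in vms}
    remaining = total_time
    while remaining > 0 and queue:
        vm = queue.popleft()
        granted = min(time_slice, remaining)
        usage[vm] += granted
        remaining -= granted
        queue.append(vm)  # back of the line until its next turn
    return usage

usage = schedule(["vm-a", "vm-b", "vm-c"], time_slice=10, total_time=90)
```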
The Fabric
Consider how the operating system and programming frameworks abstract over memory. Instead of having to deal with individual RAM cells, the operating system and your programming platform provide several abstractions on top:
• With raw physical memory, programmers can allocate memory and deal with memory addresses, rather than individual memory cells.
• With virtual memory and paging, developers can ignore the actual physical memory limits on the machine, and not worry about trampling over memory used by other processes in the system.
• With garbage collection, programmers don’t have to worry about allocating or freeing memory, since that is done automatically for them.
• With the hot-adding of RAM, modern operating systems may allow extra memory to be added on the fly without having to shut down the machine.
The fabric itself is a massive, distributed application that runs across all of Windows Azure’s machines. Instead of having to deal with thousands of machines individually, other parts of Windows Azure (and the users of Windows Azure) can treat the entire set of machines as one common resource managed by the fabric.
The Fabric Controller:
Though fabric code is running on all machines in the Windows Azure world, almost all the heavy lifting is done by a small core set of machines known as the fabric controller. The fabric controller is often called “the brain” of Windows Azure, and for good reason: it controls the operation of all the other machines, as well as the services running on them.
The fabric controller is responsible for the following key tasks:
- The fabric controller “owns” all the data center hardware. This ranges from normal machines on racks to load balancers, switches, and so on. The fabric controller knows the state of each of them, and can detect failures on them (at which time, a human being is notified if the failure can’t be fixed programmatically).
- The fabric controller makes provisioning decisions. The fabric controller’s inventory of machines is dynamic, and can be added to/removed easily.
- The fabric controller maintains the health of all the services. It monitors the services and tries to take corrective action upon detection of a failure. It also deals with upgrades and configuration changes to services.
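The upgrade handling in the last item, where only one update domain is touched at a time, can be sketched as follows. The instance dictionaries and their keys are hypothetical, and the real fabric controller's upgrade logic is not public; this only illustrates why the service keeps capacity during an update.

```python
def rolling_update(instances, new_version):
    """Update instances one update domain at a time.

    `instances` is a list of dicts with hypothetical keys
    'name', 'update_domain', and 'version'. Returns the minimum number
    of instances that stayed in service during any step.
    """
    domains = sorted({i["update_domain"] for i in instances})
    min_available = len(instances)
    for domain in domains:
        batch = [i for i in instances if i["update_domain"] == domain]
        # Instances in this domain are offline while they are updated.
        min_available = min(min_available, len(instances) - len(batch))
        for instance in batch:
            instance["version"] = new_version
    return min_available

instances = [
    {"name": f"web_{n}", "update_domain": n % 2, "version": "1.0"}
    for n in range(4)
]
available = rolling_update(instances, "1.1")
```

With four instances spread over two update domains, at least two instances remain in service at every step.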
Coding and Modeling:
In the Windows Azure world, you model your service and produce a service model with such things as worker roles, web roles, service configuration, and so on. These are all nothing more than ways to define the topology of your service.
That is exactly what the service model does for you.
In Windows Azure, each service (from the simplest “Hello World” website to the most complex services) has an associated service model. This service model is just a giant XML file that contains the same elements that your whiteboard diagram does. However, it describes them using well-defined elements such as roles, endpoints, configuration settings, and so on. The service model defines what roles your service contains, what HTTP or HTTPS endpoints they listen on, what specific configuration settings you expect, and so on.
For example, if the service model specifies a simple ASP.NET website with a single .aspx page, configured to listen on http://foo.cloudapp.net, the fabric controller transforms it into a set of procedural actions such as “bring up virtual machine,” “copy bits to Machine X,” “configure load balancer,” “provision DNS,” and so on. Specifying things in a model-driven manner not only is easier and less error-prone, but also frees up the fabric controller to optimize tasks and parallelize execution of various such tasks across the system.
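The translation from a declarative model to procedural actions can be sketched like this. The dictionary layout stands in for the real XML service model, and the `plan` function and machine naming are assumptions for illustration only.

```python
def plan(service_model):
    """Translate a declarative model into an ordered list of procedural steps."""
    steps = []
    for role in service_model["roles"]:
        for n in range(role["instances"]):
            machine = f"{role['name']}_{n}"
            steps.append(f"bring up virtual machine {machine}")
            steps.append(f"copy bits to {machine}")
    steps.append("configure load balancer")
    steps.append(f"provision DNS for {service_model['hostname']}")
    return steps

# Hypothetical in-memory stand-in for a service model.
model = {
    "hostname": "foo.cloudapp.net",
    "roles": [{"name": "WebRole", "instances": 2}],
}
steps = plan(model)
```

Because the caller never lists the steps directly, whoever executes the plan is free to reorder or parallelize the per-machine work.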
Service configuration files and service models
Instead of forcing users to deal directly with the complexity of creating service models, Windows Azure ships with a few service model templates that users can use. The web role, the worker role, and the CGI role are all examples of these templates. When you build a package, the CSPack tool uses your code and your service configuration to generate service models for your applications based on these templates.
<componentports>
  <inPort name="HttpIn" protocol="http">
    <inToChannel>
      <lBChannelMoniker name="/HelloFabric/HelloFabricGroup/FELoadBalancerHttpIn" />
    </inToChannel>
  </inPort>
</componentports>
<settings>
  <aCS name="WebRole:BannerText" defaultValue="">
    <maps>
      <mapMoniker name="/HelloFabric/HelloFabricGroup/MapWebRole:BannerText" />
    </maps>
  </aCS>
... (more XML follows)
Provisioning and Deployment
Once the package reaches the fabric controller (that is, the .cspkg file is uploaded), the fabric controller first tries to find a home for the various role instances that make up the service. This is essentially a constraint-solving problem: the service model expresses constraints such as the number of role instances, the fault domains required, the local disk and machine size required, and so on. The fabric controller looks across its machine pool and finds the right set of nodes to act as homes for these role instances, based on these constraints.
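A greatly simplified version of this placement problem can be sketched with a greedy pass over the free machines. The real fabric controller weighs many more constraints (disk, machine size, and so on); the function below, its parameters, and the node names are assumptions that only illustrate spreading instances across fault domains.

```python
def place(instance_count, nodes, min_fault_domains):
    """Pick nodes for role instances, spreading them across fault domains.

    `nodes` is a list of (node_name, fault_domain) pairs for free machines.
    Returns a list of chosen node names, or None if the constraints
    cannot be satisfied.
    """
    by_domain = {}
    for name, domain in nodes:
        by_domain.setdefault(domain, []).append(name)
    if (len(by_domain) < min_fault_domains
            or len(nodes) < instance_count
            or instance_count < min_fault_domains):
        return None
    chosen = []
    # Take one node from each fault domain in turn (greedy round-robin).
    while len(chosen) < instance_count:
        for domain in sorted(by_domain):
            if by_domain[domain] and len(chosen) < instance_count:
                chosen.append(by_domain[domain].pop(0))
    return chosen

free_nodes = [("n1", 0), ("n2", 0), ("n3", 1), ("n4", 1), ("n5", 2)]
placement = place(3, free_nodes, min_fault_domains=2)
```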
How this is accomplished behind the scenes is fairly interesting. Following is a brief
synopsis:
1. In the beginning, all servers have nothing on them, and the fabric controller powers them on programmatically.
2. Each server is configured to boot from the network using normal Preboot Execution Environment (PXE) requests. (PXE is a standard mechanism to boot computers using a network interface.) It downloads a small operating system image called the maintenance OS and boots into it. (That name is a bit of a misnomer, since it actually doesn’t do any “maintenance” per se; rather, it bootstraps the machine and is the first bit of fabric code that runs on the target node.) All communication with the maintenance OS is through a fabric agent that lives on the maintenance OS.
3. The maintenance OS talks to the fabric controller. After some secure handshaking, it sets up the host partition. Remember that of the various partitions on the hypervisor, the host/root partition is special and has the capability to directly talk to the hardware.
4. The maintenance OS pulls down a VHD with the operating system for the host partition. Currently, this is based on Windows Server Core, the lightweight, minimal, stripped-down version of Windows Server 2008. The maintenance OS restarts the machine to boot into the host OS. This is done using a “boot-from-VHD” feature, which, as mentioned previously, is now available as a part of Windows 7.
5. When the host partition/virtual machine starts up, it has an agent that can talk to the fabric controller as well. The fabric controller tells the agent how many guest partitions to set up, and what VHDs to download. These are cached so that this download must happen only the first time.
6. For each guest partition, there’s a base VHD that contains the operating system, and a differencing disk that contains any changes to disk. This is a standard practice used with virtual machines that lets you destroy the differencing disk to get back to the pristine base VHD state. In the case of Windows Azure, resetting a guest virtual machine to its original state is as simple as deleting the differencing disk. These guest VHDs contain a version of Windows Server 2008 Enterprise with modifications to integrate with the Windows Azure hypervisor.
7. Once the guest virtual machine is up and running, the specified role instance’s files are copied onto the machine. Depending on the kind of role, different actions are taken. For example, a web role is launched in an http.sys-based web hosting environment, while a worker role is launched similar to a normal program. Currently, there is a strict enforcement of one role instance to one virtual machine. This helps ensure that the virtual machine is an isolation boundary both from a security and a resource utilization perspective.
8. The fabric controller repeats this process for each role instance in the service model. Typically, only the last step must be performed, because most machines would have had the initial bootstrapping done already.
9. The fabric controller programs the load balancer and other network hardware to route traffic from the external address assigned to the service (say, for example, foo.cloudapp.net) to the individual role instances. The role instances are placed behind a load balancer, which performs a simple round-robin algorithm to route traffic between the instances. Once the right networking routes are in place, traffic from the outside world flows to and from the role instances.
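The round-robin routing mentioned in the final step is easy to picture in code. This sketch is not the load balancer's implementation; the class name, the instance names, and the request strings are made up for the example.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Route each incoming request to the next role instance in turn."""

    def __init__(self, instances):
        self._next = cycle(list(instances))

    def route(self, request):
        # Return the chosen instance together with the request it will handle.
        return next(self._next), request

balancer = RoundRobinBalancer(["web_0", "web_1", "web_2"])
targets = [balancer.route(f"GET /page{i}")[0] for i in range(5)]
```

After the third request the balancer wraps around to the first instance, so over time each instance receives an equal share of the traffic.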
First Cloud App Using Visual Studio Tools
Signing Up for Windows Azure
The Windows Azure Developer Portal is a one-stop shop for all your service management and Windows Azure needs. The portal contains everything from all the projects/storage accounts underneath your account, to links to billing information. To sign up, head to http://windows.azure.com. You’ll be asked to create an account by signing in with your Live ID credentials and providing billing information. The entire account creation process is quick and painless.
Getting and Installing the Tools
You can use the following two primary tools to develop for Windows Azure:
• Windows Azure Software Development Kit (SDK)
• Windows Azure Tools for Visual Studio (which is bundled with the SDK as well)
The Windows Azure SDK is a free download that contains CSPack (used for packaging your applications), the Development Fabric, and other essential tools needed for building applications for Windows Azure. (You’ll learn what these tools do later in this chapter.) The SDK typically has a new version released every few months. You can find a link to the latest SDK download at http://www.microsoft.com/azure.
Installing these prerequisites separately can be a hassle. One easy way to install all of them with one tool is through the Microsoft Web Platform installer, available from http://www.microsoft.com/web. Figure 3-1 shows the Web Platform installer with the Windows Azure tools (an early version) set to be installed. Note how the necessary dependencies are detected and installed.
Getting to Know the SDK and Tools
If everything installed correctly, you should see a variety of new items in your Start menu, as shown in Figure 3-2. (Note that this figure is reproduced from an early build of the software, and you’ll almost surely see a different icon.)
Understanding the Development Fabric
On the actual cloud, your code runs on different virtual and physical machines. Since launching a virtual machine on a normal physical (development) machine requires a lot of overhead, the Development Fabric (Dev Fabric) launches different processes instead. For every virtual machine you’ll see launched in the cloud, the Dev Fabric launches a local process called RdRoleHost.exe to host your application. Since these are just normal processes, you can attach a debugger to them, and perform typical debugging tasks (such as setting breakpoints, inspecting and changing values, stepping through code, and so on). In fact, this is how the Visual Studio extensions provide debugging support; they attach to the processes launched by the Dev Fabric.
The Dev Fabric provides a simulation of the Windows Azure fabric on a local machine. In short, the Dev Fabric enables a developer to build, debug, and test code locally before deploying to the actual cloud.
Development Storage
While the Dev Fabric simulates the Windows Azure fabric and is used for hosting code, the Development Storage (Dev Storage) part of the SDK is used to simulate the Windows Azure storage services—blobs, tables, and queues. It does this by using a local SQL Server instance as a backing store, and providing a local running instance of the Windows Azure blobs, tables, and queues.
Developing Your First Cloud Application
Understanding Windows Azure Roles
Most applications informally group their machines by the job each one does. Windows Azure formalizes this grouping into something called roles. A Windows Azure role roughly corresponds to a “type” of box. However, each role is tweaked for a special purpose.
Table 4-1. Windows Azure role and role template types
Role type | Description
Web role | This is analogous to an ASP.NET website hosted in IIS (which is, in fact, exactly how Windows Azure hosts your code, as you saw in Chapter 2). This is your go-to option for hosting websites, web services, and anything that needs to speak HTTP and can run on the IIS/ASP.NET stack.
Worker role | A worker role in Windows Azure fulfills the same role that a long-running Windows service, cron job, or console application would in the server world. You get to write the equivalent of an int main() that Windows Azure will call for you. You can put absolutely any code you want in it. If you can’t fit your code in any of the other role types, you can probably find a way to fit it here. This is used for everything, including background jobs, asynchronous processing, hosting application servers written in non-.NET languages such as Java, or even databases such as MySQL.
CGI web role (web role) | Windows Azure offers direct support for hosting languages and runtimes that support the FastCGI protocol. The CGI Role template makes this easier for you. Though this is offered as a first-class role type in Visual Studio, under the covers it is just a web role with the CGI option turned on.
WCF service role (web role) | This is another customized version of the web role, targeted at hosting WCF services. Under the covers, this is just a web role with some Visual Studio magic to make it easier to write a WCF service.
Writing the Code for the Web Role
Start by creating a directory called htmlwebsite. This directory will be the root of your website (the equivalent of inetpub/wwwroot in IIS, or the root of your ASP.NET application). Any content you put in here will “hang” off the root URL of your website.
Let’s now create the contents of the website. Create a new file called index.html in the htmlwebsite directory with the contents shown in Example 3-1. (Since this is just a normal HTML page, you can put any valid HTML content you want in here.)
Example 3-1. Über-complex web page
<html>
<body>
Hello World!
</body>
</html>
To get Windows Azure to run this trivial site, you must provide two pieces of metadata: the service definition and the service configuration. These are stored in two XML files called ServiceDefinition.csdef and ServiceConfiguration.cscfg, respectively.
So, let’s create two files called ServiceDefinition.csdef and ServiceConfiguration.cscfg with the contents of Examples 3-2 and 3-3, respectively.
Example 3-2. Sample ServiceDefinition.csdef
<?xml version="1.0" encoding="utf-8"?>
<ServiceDefinition name="CloudService1"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="WebRole1" enableNativeCodeExecution="false">
    <InputEndpoints>
      <InputEndpoint name="HttpIn" protocol="http" port="80" />
    </InputEndpoints>
    <ConfigurationSettings />
  </WebRole>
</ServiceDefinition>
Example 3-3. Sample ServiceConfiguration.cscfg
<?xml version="1.0"?>
<ServiceConfiguration serviceName="CloudService1"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
  <Role name="WebRole1">
    <Instances count="1" />
    <ConfigurationSettings />
  </Role>
</ServiceConfiguration>
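The two files must agree with each other: every role named in the service configuration should also be declared in the service definition. The following Python sketch, which is not part of the SDK, shows one way to check that and to read the instance count. The function name role_instance_counts is hypothetical, and the sketch only looks at WebRole elements because that is all the sample definition contains.

```python
import xml.etree.ElementTree as ET

# Namespaces copied from the two sample files.
DEF_NS = "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition"
CFG_NS = "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"

SERVICE_DEFINITION = f"""<?xml version="1.0" encoding="utf-8"?>
<ServiceDefinition name="CloudService1" xmlns="{DEF_NS}">
  <WebRole name="WebRole1" enableNativeCodeExecution="false">
    <InputEndpoints>
      <InputEndpoint name="HttpIn" protocol="http" port="80" />
    </InputEndpoints>
    <ConfigurationSettings />
  </WebRole>
</ServiceDefinition>"""

SERVICE_CONFIGURATION = f"""<?xml version="1.0"?>
<ServiceConfiguration serviceName="CloudService1" xmlns="{CFG_NS}">
  <Role name="WebRole1">
    <Instances count="1" />
    <ConfigurationSettings />
  </Role>
</ServiceConfiguration>"""


def role_instance_counts(definition_xml, configuration_xml):
    """Return {role name: instance count}, checking each role is defined."""
    definition = ET.fromstring(definition_xml.encode("utf-8"))
    configuration = ET.fromstring(configuration_xml.encode("utf-8"))

    # Only web roles are collected here; other role elements are ignored.
    defined = {role.get("name") for role in definition.findall(f"{{{DEF_NS}}}WebRole")}

    counts = {}
    for role in configuration.findall(f"{{{CFG_NS}}}Role"):
        name = role.get("name")
        if name not in defined:
            raise ValueError(f"role {name!r} is configured but not defined")
        instances = role.find(f"{{{CFG_NS}}}Instances")
        counts[name] = int(instances.get("count"))
    return counts


counts = role_instance_counts(SERVICE_DEFINITION, SERVICE_CONFIGURATION)
print(counts)
```

Changing the count attribute in the configuration is what changes how many instances of WebRole1 are requested; the definition itself stays the same.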
Packing the Code for the Dev Fabric
You now have all the code you need to run the service on Windows Azure. However, to run applications on Windows Azure (either in the production fabric or in the Dev Fabric), you must package the applications in a special format. This lays out your application binaries in a specific folder structure, and generates some files used internally by Windows Azure.
Run the command shown in Example 3-4 from the directory above htmlwebsite (which contains the index.html web page). This will “package” your application code and service definition file into a Windows Azure-understandable format in a directory named output.
Example 3-4. Packing the sample application for the Dev Fabric
D:\>cspack htmlwebsite\ServiceDefinition.csdef /role:WebRole1;htmlwebsite /out:output /copyonly
Windows(R) Azure(TM) Packaging Tool version 1.0.0.0
for Microsoft(R) .NET Framework 3.5
Copyright (c) Microsoft Corporation. All rights reserved.
Running the Code in the Dev Fabric
The final step in creating your first application is to run the packaged application in the Dev Fabric. To do that, run the command shown in Example 3-6. This uses another utility that ships with the SDK, CSRun, to point to the output files and to launch the Dev Fabric with your specified configuration.
Example 3-6. Launching the Dev Fabric
D:\>csrun /run:output;htmlwebsite/ServiceConfiguration.cscfg
Windows(R) Azure(TM) Desktop Execution Tool version 1.0.0.0
for Microsoft(R) .NET Framework 3.5
Copyright (c) Microsoft Corporation. All rights reserved.
Created deployment(21)
Started deployment(21)
Deployment input endpoint HttpIn of role WebRole1 at http://127.0.0.1:81/
The Dev Fabric acts not only as a simulation of the Windows Azure fabric, but also as a local web server. In this case, as the last line of the command output indicates, it has launched your site at the local URL http://127.0.0.1:81. Since your web page was called index.html, navigate to http://127.0.0.1:81/index.html in any web browser.
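To make the structure of the two command lines easier to see, here is a small Python sketch that assembles them from their parts. It only builds the argument lists and does not run anything; the helper names cspack_args and csrun_args are hypothetical, and the only switches used are the ones that appear in Examples 3-4 and 3-6.

```python
def cspack_args(definition_path, role_name, role_dir, out_dir):
    """Build the CSPack invocation used in Example 3-4."""
    return [
        "cspack",
        definition_path,
        f"/role:{role_name};{role_dir}",  # role name, then the directory holding its files
        f"/out:{out_dir}",                # where the packaged layout is written
        "/copyonly",
    ]


def csrun_args(package_dir, configuration_path):
    """Build the CSRun invocation used in Example 3-6."""
    # CSRun takes the package directory and the configuration file, separated by ';'.
    return ["csrun", f"/run:{package_dir};{configuration_path}"]


pack = cspack_args(r"htmlwebsite\ServiceDefinition.csdef", "WebRole1", "htmlwebsite", "output")
run = csrun_args("output", "htmlwebsite/ServiceConfiguration.cscfg")
print(" ".join(pack))
print(" ".join(run))
```

Note that the directory passed to /out when packaging is the same directory given first in /run when launching.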
Creating a new hosted service project
The Developer Portal is also where you create projects, which are either hosted services (that let you run code) or storage accounts (that let you store data in Windows Azure storage).
Uploading packages
Windows Azure offers you two environments, called deployment slots: staging (where you deploy your build if you want to test it before it goes live) and production (where the build goes live). Each slot can contain its own package, and each runs separately on its own URL. The production slot runs at the servicename.cloudapp.net URL you picked out, while the staging slot runs at a randomly picked URL of the form <some-guid>.cloudapp.net.
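The difference between the two slot URLs can be sketched in a few lines of Python. This is purely illustrative: the function name slot_url is hypothetical, and the exact format of the staging GUID (32 hexadecimal characters here) is an assumption, since the text only says the URL is randomly picked.

```python
import uuid


def slot_url(slot, service_name):
    """Return an illustrative URL for a deployment slot."""
    if slot == "production":
        # Production uses the service name you picked.
        return f"http://{service_name}.cloudapp.net"
    if slot == "staging":
        # Staging gets a random identifier; its exact format is assumed here.
        return f"http://{uuid.uuid4().hex}.cloudapp.net"
    raise ValueError(f"unknown deployment slot: {slot!r}")


print(slot_url("production", "foo"))
print(slot_url("staging", "foo"))
```

Because the staging URL is random, it is hard to guess, which suits a build that is still being tested.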
Using the Visual Studio Tools
Just as Visual Studio wraps around the C# or Visual Basic compilers to provide an integrated experience with the IDE, the Visual Studio extensions wrap around CSPack and the Dev Fabric to provide an integrated experience with Windows Azure. This also provides an additional important feature that is difficult to reach with just the command-line tools alone: debugging support. The following discussion assumes that you’ve finished installing the SDK and the Visual Studio tools. If you haven’t, follow the instructions presented earlier in this chapter on where to get the right bits and how to install them.
Let’s start by opening Visual Studio and selecting File→New→Project. Select the Cloud Service template on the list.
A cloud service solution contains only the service definition and service configuration files for your service. It doesn’t contain (by default) the actual roles that make up your service.
The WebRole1 project itself is just an ASP.NET web application project under the covers. It is identical to a typical ASP.NET project in almost all ways. However, instead of hooking up with the ASP.NET development server, Cloud Service projects are automatically run in the Dev Fabric.
Looking at Worker Roles in Depth
A worker role is the Swiss Army knife of the Windows Azure world. It is a way to package any arbitrary code, from something as simple as creating thumbnails to something as complex as entire database servers. The concept is simple: Windows Azure calls a well-defined entry point in your code, and runs it as long as you want.
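The "entry point that runs as long as you want" idea can be sketched as a simple polling loop. The following Python is a conceptual illustration only, not the Windows Azure worker role API: run_worker, make_thumbnail_name, and the max_idle_polls cutoff are all hypothetical, and the cutoff exists only so that the sketch terminates.

```python
import time
from collections import deque


def run_worker(pending, handle, poll_seconds=0.01, max_idle_polls=3):
    """Entry point in the style of a worker role: loop, pull work, process it.

    A real worker would typically loop forever; max_idle_polls stops this
    sketch after the queue has been empty for a few consecutive polls.
    """
    idle_polls = 0
    processed = []
    while idle_polls < max_idle_polls:
        if pending:
            processed.append(handle(pending.popleft()))
            idle_polls = 0
        else:
            # Nothing to do: wait briefly before checking again.
            idle_polls += 1
            time.sleep(poll_seconds)
    return processed


def make_thumbnail_name(image_name):
    # Stand-in for real work such as resizing an image.
    return "thumb_" + image_name


jobs = deque(["a.png", "b.png"])
results = run_worker(jobs, make_thumbnail_name)
print(results)
```

The platform's job is to call the entry point and keep it running; what happens inside the loop is entirely up to your code.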
Creating Worker Roles
Creating a worker role is simple using the Visual Studio tools. Figure 4-7 shows how to add a new worker role to your project using the new cloud service dialog (you can add one to an existing solution, too, if you prefer). This will generate a new Visual Studio project for you, along with the correct entries in ServiceDefinition.csdef.












