Thursday, October 15, 2009

Azure: Fabric Controller

In Azure, the Fabric Controller's view of a deployed service is a bit like this:



There is a logical service, which consists of several logical roles (e.g., for our example application, the service was the overall application and the roles included a Web and two Worker roles). There may be multiple instances of any single Logical Role, each of which is a Logical Role Instance (LRI). In the above picture, there are three instances of the Logical Role R_2. Each LRI is mapped to a single Logical Node, which is a hosting environment (e.g., a VM). There may be multiple Logical Nodes (VMs) running on any given Physical Node or computer.

The purple ovals correspond to information provided by the application developer. That is, in the configuration (service model) for this app, the developer provides the Service Description, a Role Description for each role (which provides attributes of that role, such as, what hosting environment is required, how many resources are needed, and so on), and a Role Instance Description. The Fabric Controller maps each Role Instance Descriptor to its corresponding Logical Role Instance. Similarly, each Role Descriptor is mapped to the appropriate Logical Role. (Note that not all of these connections are shown in the above picture.)

In this way, deploying a service consists of mapping the graph describing an application's topology (as provided in that app's service model) to the graph describing the inventory managed by the FC.

Driving the logical nodes
When an LRI is bound to a logical node (VM), that logical node's goal state is set to be whatever the LRI's goal state is. For example, Logical Node LN_22 in the above has a goal state which corresponds to LRI_22's goal state. The logical node also has a current state, which is obtained from the physical node on which this VM is running. The current state is kept up-to-date by communicating with the underlying physical hardware.

They don't say how it's constructed, but evidently, in addition to the current state and the goal state for each VM, the state machine for the corresponding LRI is also known. A state machine identifies all the possible states an instance can be in, as well as which events cause transitions between which states. There is a stream of events (obtained via polling or interrupts) that provides updates on the physical hardware. These events are used, along with the state machine, to determine the current state. Periodically, the FC figures out the appropriate next steps given the current state, the state machine, and the goal state. Then those actions are performed.
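To make the "current state, state machine, goal state" idea concrete, here is a minimal sketch of how next steps could be derived. The state names, action names, and the breadth-first planner are all invented for illustration; the talk does not say how the FC actually represents or searches its state machines.

```python
from collections import deque

# Hypothetical state machine for one role instance:
# state -> {action: next state}. All names are made up for this sketch.
TRANSITIONS = {
    "unallocated": {"allocate_vm": "vm_ready"},
    "vm_ready": {"deploy_role": "stopped"},
    "stopped": {"start_role": "running"},
    "running": {"stop_role": "stopped"},
    "failed": {"reboot_vm": "vm_ready"},
}


def plan(current, goal, transitions=TRANSITIONS):
    """Return the shortest list of actions from current to goal, or None."""
    queue = deque([(current, [])])
    seen = {current}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action, nxt in transitions.get(state, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, actions + [action]))
    return None  # the goal cannot be reached from the current state


def reconcile(current, goal):
    """Apply the planned actions one by one and return the final state."""
    steps = plan(current, goal)
    if steps is None:
        raise ValueError(f"no path from {current!r} to {goal!r}")
    for action in steps:
        current = TRANSITIONS[current][action]
    return current, steps
```

For example, `plan("failed", "running")` yields the actions needed to bring a failed instance back to its goal state, which is the kind of periodic computation described above.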

Identifying when there is a problem and handling it (automatically)
The FC maintains a cache of what it believes is the current state of each node. The Fabric Agent (which lives on each node) communicates with the FC to help the FC keep this cache updated. The FC detects when a Role dies. Multiple sources participate in monitoring Role health (and notifying the FC when a Role is not healthy): the Load Balancer issues probes to machines (i.e., pings them), and a Role can also notify the FC that it (the Role) is unhealthy. When the FC learns (through whatever channel) that some Role is not healthy, the FC updates the current state for that node. Then the appropriate next steps to take, to get that node closer to its goal state, are determined and undertaken.

When a node goes offline, an attempt is made to recover that node. If the reason was a hardware failure (or, generally, something which cannot be remediated automatically) the role is migrated to another node.

(Note that the above exposition would have benefited greatly from a few concrete examples: what a typical goal state is, what a typical current state is, how the state machine (which identifies what's needed to transition between states) for a logical role instance is created, and so on.)

Wednesday, October 14, 2009

More detailed notes on Azure architecture

Below we have some notes from: "Introducing the Windows Azure Platform" (by David Chappell)
http://go.microsoft.com/fwlink/?LinkId=158011 as well as from the video presentation (from PDC 2008) by Erick Smith and Chuck Lenzmeier, "Under the Hood: Inside the Windows Azure Hosting Environment" http://channel9.msdn.com/pdc2008/ES19/. The PowerPoint slides associated with that video lecture can be found here: http://mschnlnine.vo.llnwd.net/d1/pdc08/PPTX/ES19.pptx.

On:
  • General comments
  • A motivating app
  • Azure OS Compute: Roles, Agents, what an app consists of
  • Azure OS Storage: Blobs, Tables, and Queues. How one can store, access, and update data using the Azure OS.
  • The Fabric Controller: manages application provisioning (resource allocation), deployment, monitoring, and maintenance (including identifying when an app instance has gone down and needs to be restarted, when other hardware needed for the app has failed, when an app needs to be scaled up or down in response to demand).
  • The service model: a contract between the application creator and the Azure platform. A way for the app creator to define how he wants his app to be managed. The FC enforces the Azure side of this contract, ensuring the app stays "healthy," according to the definition of health provided by the app's creator.
  • The service lifecycle and its constituent parts.
  • Miscellaneous Other, including security
General comments
At a high level of abstraction, think of the Azure Platform as a gigantic computer, though in actuality it consists of many physical computers, switches, routers, load balancers, and so on. A single logical OS runs over all of these different pieces of hardware and manages them. So in the same way that a user adds a physical device (such as a NIC) to his PC by installing the driver for that device, the Azure OS adds switches, computers, and so on via installing the driver for the device. The driver in this case may not necessarily be the software that ships with the hardware (as would be the case when a user installs some hardware on his PC) but instead some Azure-specific code which lets the Azure OS monitor and fully manage the given machine, whatever it may be.

As noted, a user can run an app on Azure OS or store data using Azure OS or both. If he runs an app on Azure OS, he can perform horizontal scaling, that is, run more than one instance of the same app on different machines. In effect the app is replicated across different resources so that the overall app bandwidth (e.g., ability to perform a distributed computation or serve clients) is increased.

An app itself can consist of more than one "role," where there are front-end roles (those which receive HTTP/S connections) and back-end roles (those which do not accept incoming network connections; they are more processing or computation focused). So there can be replication of a particular role (e.g., create a new instance of the front-end "Web" role for every 1500 new client connections I receive) or of the entire app itself (e.g., create another instance of this entire web application, where that instance will include some number of instances of front-end and back-end roles). (CONFIRM THIS: it is certainly the case that one can replicate instances of particular roles; unclear whether replication can also be done at the level of an entire app, i.e., at app-granularity.)

A motivating app, for concreteness
I think our exploration of Azure would benefit greatly from considering some particular example app. So let's imagine the following app (that I just made up right this second): a credit-checking app used by landlords to determine whether or not to rent to some would-be tenant. The front-end of the app accepts a connection from a landlord who does two main things (via filling out a Web form): (1) provides all of the information about the would-be tenant (name, age, birthdate, SSN, previous addresses, previous landlords, references, ...); (2) provides order-fulfillment information (identifying where the finished report should be sent, how the landlord is going to pay for use of the service, and so on).

Let's assume we can use a single front-end role to accept this information. Depending upon the design of the app, the landlord may need to submit more than one form (e.g., one form for info about the would-be tenant and a second form for order fulfillment info). For simplicity, let's assume there's a single form and a single front-end role. Naturally, the number of instances of that front-end role should depend upon the level of demand for the service at any given instant of time. (Note that an app can increase the number of front-end instances, back-end instances, or both. Similarly, an app can "shed load" by killing instances of front- or back-end roles.)

Then there are at least two back-end roles: one for payment processing and a second for generating and supplying the report. The back-end roles can make outbound network connections, as necessary, to process the credit card payment or to retrieve info on the would-be tenant (even to invoke other Web services to perform some part of the due diligence involved in report generation), and so on.

Clearly there must be some information sharing across these roles; the info gathered by the front-end role needs to be made available to the back-end roles. Further, if there are multiple instances of each back-end role, those instances need to cooperate so that they don't duplicate one another's work. There are a number of different ways to structure this workflow; an obvious approach is for the front-end role to populate a record or document with the info it has gathered and then add that record/document to a queue, from which the back-end roles take work. So there's the familiar producer/consumer relationship there. Clearly, we want some kind of notification to be sent to idle Workers whenever a new job is available.

I'm not sure whether it's even possible for the Web role to create a network connection directly to the Worker role. We know that the outside world cannot connect to the Worker role; not sure whether that prohibition also extends to internal network connections. Depending upon the support for real-time notification of Workers that work is available, it might be appealing to be able to directly poke a worker. Some tradeoffs in polling versus interrupt-based notification but no obvious new twists on those tradeoffs presented here.

Azure OS Compute
True of both a Front-End Role and a Back-End Role:
  • Runs in a VM.
  • Can interact with Azure Storage:
    • Read/write messages from/to a queue.
    • Read/write data from/to a table.
    • ...
  • Interacts with an Agent, which runs in the same VM.
  • Can write to / read from the local filesystem (within the VM); such changes will not persist across reboot though.
A Front-End Role (referred to in MSFT literature as a "Web Role")
  • Receives HTTP/S connections.
  • Runs on IIS7 (in a VM).
  • Is stateless with respect to clients. What does that mean? Each separate "request" from a client (e.g., HTTP GET or POST message) could be routed to a different front-end role. Hence, all important client state must either be returned to the client (so that he can present that info with each request he makes) or put in the database (where all front-end roles can access the info). Clients can retain and re-present state via maintaining it in a URL parameter or cookie.
A Back-End Role (referred to in MSFT literature as a "Worker Role")
  • Does NOT receive any incoming network connections.
  • Can make outbound network connections.
A Windows Azure Agent
  • Runs in the same VM as the front- or back-end role.
  • Exposes an API which the Roles can invoke in order to:
    • Write to the log
    • Send an alert to the application owner
    • ...
  • Is the way that an app can interact with the Fabric Controller.
An app consists of:
  • N instances of Front-end Roles (can be zero)
  • M instances of Back-end Roles (can be zero)
  • A "hardware" load balancer (only necessary if have >1 Web or front-end role)
  • A config file
    • How many Web role (front-end) instances
    • How many Worker role (back-end) instances
  • A model: which defines what it means for this app to be healthy, the ways in which the app can be unhealthy (i.e., constraints which, when violated, indicate unhealthiness), remedial actions to take when an app is unhealthy (e.g., healthy means 1500 or fewer client connections per Web role; if all Web roles are currently managing 1500 client connections and the app receives another client connection, create a new instance of the Web role and send the new client to that instance). 
  • A unified log which contains messages received both from Web and Worker instances.
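The health rule used as an example above (at most 1500 client connections per Web role) can be sketched as a small calculation. This is only an illustration of the arithmetic; the function names are hypothetical, and the notes do not say how such a rule is actually expressed in the service model.

```python
import math

MAX_CONNECTIONS_PER_WEB_ROLE = 1500  # threshold taken from the example above


def web_instances_needed(total_connections, minimum=1):
    """Smallest instance count keeping every Web role at or under the limit."""
    needed = math.ceil(total_connections / MAX_CONNECTIONS_PER_WEB_ROLE)
    return max(minimum, needed)


def scaling_action(current_instances, total_connections):
    """Instances to add (positive) or remove (negative) to satisfy the rule."""
    return web_instances_needed(total_connections) - current_instances
```

With two Web role instances and 3001 connections, `scaling_action(2, 3001)` says one more instance is needed; a negative result corresponds to shedding instances.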
Azure Storage
"All info held in Azure Storage is replicated three times." Provides strong consistency.

Every blob, table, and queue is named using URIs and accessed via standard HTTP operations. So these objects can be accessed locally (by Worker and Web Roles running in the same data center, for example) or they can be accessed across the WWW. Can use REST services to identify and access each Azure Storage data object. There are libraries which encapsulate the REST protocol operations (e.g., ADO.NET Data Services or Language Integrated Query (LINQ)) or one can make "raw HTTP calls" to achieve the same effect. If query a table stored in Azure Storage and that query has lots of results, can obtain a continuation object which lets you process some results then get the next set of results, and so on. So a continuation object is like a list iterator.
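The "continuation object is like a list iterator" idea can be sketched as follows. This is an in-memory toy, not the Azure Storage API: `query_page`, `iterate_all`, and the integer token are all assumptions made for illustration.

```python
def query_page(rows, page_size, continuation=None):
    """Return one page of results plus a token for the next page (or None)."""
    start = continuation or 0
    page = rows[start:start + page_size]
    next_start = start + page_size
    token = next_start if next_start < len(rows) else None
    return page, token


def iterate_all(rows, page_size):
    """Follow continuation tokens until the result set is exhausted."""
    token = None
    while True:
        page, token = query_page(rows, page_size, token)
        yield from page
        if token is None:
            break
```

The caller can stop after any page and resume later by presenting the token, which is what distinguishes this from simply downloading the whole result set.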
  1. Blobs: a blob consists of binary data. Can be up to 50GB each. Can have associated metadata.
    • A storage account can have multiple containers; a container cannot contain another container (i.e., containers can't be nested). A blob name can contain slashes, though, so one can create the illusion of hierarchical blob storage.
    • Each container (akin to a file folder) holds blobs.
    • Can subdivide a blob into blocks; then, if there is a network failure while that blob is being downloaded, don't have to re-download the entire blob, just the blocks that weren't successfully delivered the first time.
    • Each container can be marked public or private.
      • Public blobs: can be read by anyone; writing requires a signed token.
      • Private blobs: both read and write operations require a signed token.
    • A blob can be reached by:
      http://<StorageAcct>.blob.core.windows.net/<Container>/<BlobName>
  2. Table: actually a set of entities, each of which has a set of properties (and values for those properties).
    • Access table via conventions defined in ADO.NET Data Services. Cannot query a table in Azure Storage using SQL. And a table cannot be required to conform to a particular schema.
    • The reason for this non-relational representation is that it makes it easier to "scale out storage," i.e., to spread the data in the table across machines. Harder to do this if the data is stored in a relational table (what's hard is probably performing aggregate functions over the entire table, which is not allowed for tables in Azure OS Storage).
    • An Azure table can contain billions of entities holding terabytes of data.
    • A table consists of some set of entities
      • Each entity consists of some set of properties.
        • Each property consists of a Name, Type, and Value.
    • So in a way, we can think of a table as a nested structure: every property conceptually consists of a table which holds three tuples, in particular those which give the property's name (<Name, NameVal>), value (<Value, ValueVal>), and the type of that value (<Type, TypeVal>). Supported types include Binary, Bool, DateTime, Double, GUID, Int, Int64, String. A property's type can change over that property's lifetime, depending upon what value is currently stored for the property.

      For example, some property p_12 might be represented by tuples: <Name, "CreationTime">, <Type, "DateTime">, and <Value, "12-01-2009 05:43:22 GMT">. So this property (p_12) says that the entity that it describes was created on December 1st, 2009, early in the morning (which is interesting since that date has not yet arrived but no mind!).

      The next layer of nesting is that an entity contains a table consisting of all of that entity's properties. The final layer of nesting is that an actual Azure Storage Table consists of some set of entities and each entity's associated properties table.
    • Each entity can be up to 1MB in size. Accessing any part of an entity results in accessing the entire entity.

  3. Queue: the primary purpose of a queue is to provide a way for Web roles and Worker roles to communicate with one another. A queue contains messages, the format of which can be application-defined. Each message can be up to 8KB. A Worker role reads a message from the queue, which does not result in the message being deleted. Rather, after a message is read, that message is not shown to anyone else for 30 seconds. Within that 30 seconds, the Worker who read the message could have handled the task and deleted the message. Or that Worker might have died before completing the task, in which case, the message would be shown to other Workers after 30 seconds have elapsed. This 30-second period is referred to as the visibility timeout.

    Note that a message might be delivered multiple times. Also, there is no guarantee on the order in which messages are delivered. Presumably this means that a particular message on the queue might get "starved," i.e., never delivered to any Worker (in contrast to traditional FIFO queue semantics).
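The visibility-timeout behavior of a queue described above can be mimicked with a small in-memory toy. This is a sketch under stated assumptions, not the Azure Queue API: the class and method names are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
import itertools


class VisibilityQueue:
    """In-memory toy queue: a read hides a message instead of deleting it."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self._ids = itertools.count(1)
        self._messages = {}  # message id -> (body, time it becomes visible)

    def put(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = (body, 0.0)
        return msg_id

    def get(self, now):
        """Return (id, body) of some visible message and hide it, or None."""
        for msg_id, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                hidden_until = now + self.visibility_timeout
                self._messages[msg_id] = (body, hidden_until)
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Called by a Worker once it has finished handling the message."""
        self._messages.pop(msg_id, None)
```

If a Worker calls `get` and then dies without calling `delete`, the same message is returned again by a `get` made 30 or more seconds later, which is why a message may be delivered more than once and handlers should tolerate duplicates.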
Azure Storage access control: it sounds like they use a token-based system in order to control who can access what data. In particular, when a user creates a storage account, she's given a secret key. Every request made to the storage account (to modify account settings, to access some blob stored as part of this account, to edit a table, and so on) carries a signature which is signed by this secret key (like a token). Having such a token means that you can access all the data stored in that account (as well as the account settings themselves?); i.e., the granularity at which access control decisions are made is account-level. Presumably, a user Alice who owns an account can provide a signed token to other users whom Alice wants to allow to read/modify stored data. (CONFIRM.) Regrettably, it appears that Alice has to let these other users have all types of access (create, modify, delete) to everything (all data as well as account settings?). That said, part of the Windows .NET Cloud Services is access control so perhaps this could be layered on top of storage in order to achieve a more fine-grained access control system? (CHECK.)

Update: there is evidently now a way to provide container-level or blob-level access (rather than only being able to give account-wide access). See: http://blog.smarx.com/posts/new-storage-feature-signed-access-signatures
Windows Azure storage operates with “shared key authentication.”  In other words, there’s a password for your entire storage account, and anyone who has that password essentially owns the account.  It’s an all-or-nothing permission model.  That means you can never give out your shared key.  The only exception to the all-or-nothing rule is that blob containers can be marked public, in which case anyone may read what’s inside the container (but not write to it).
Signed access signatures give us a new mechanism for giving permission while retaining security.  With signed access signatures, my application can produce a URL with built-in permissions, including a time window in which the signature is valid.  It also allows for the creation of container-level access policies which can be referenced in signatures and then modified or revoked later.  This new functionality enables a number of interesting scenarios...
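The general idea of a signature that carries built-in permissions and a validity window can be sketched with a keyed hash. The string-to-sign layout, the function names, and the permission letters below are assumptions for illustration only; they are not the actual Azure Storage signature format.

```python
import base64
import hashlib
import hmac


def sign(secret_key: bytes, resource: str, permissions: str, expiry: int) -> str:
    """Sign resource, permissions, and expiry with HMAC-SHA256 (made-up layout)."""
    message = f"{resource}\n{permissions}\n{expiry}".encode("utf-8")
    digest = hmac.new(secret_key, message, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def is_allowed(secret_key, resource, permissions, expiry, signature, wanted, now):
    """Check the signature, then the time window and the requested permission."""
    expected = sign(secret_key, resource, permissions, expiry)
    if not hmac.compare_digest(expected, signature):
        return False  # parameters were tampered with, or the wrong key was used
    return now <= expiry and wanted in permissions
```

Because the permissions and expiry are part of the signed message, the holder of the URL cannot widen them without invalidating the signature, and the account key itself never has to be handed out.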
Azure Storage concurrency: The basic model appears to be that a user or app would download a blob or entity then modify that blob/entity locally. After that, the user/app would attempt to commit those local changes by writing the updated version back to storage. So, what happens if more than one app attempts to modify the same storage object (e.g., blob or table) at the same time? The first writer wins. Subsequent writes (of the same version) will fail. This is a form of optimistic concurrency and is achieved via maintaining version numbers. Note that an app can force its changes to stick by unconditionally updating an entity or blob.
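A minimal sketch of version-based optimistic concurrency, assuming an in-memory store with integer version numbers. The `Store` class and its methods are hypothetical and only illustrate the "first writer wins" rule and the unconditional-update escape hatch described above.

```python
class VersionConflict(Exception):
    """Raised when a conditional write is based on a stale version."""


class Store:
    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        """Return (version, value); version 0 means the key does not exist."""
        return self._data.get(key, (0, None))

    def write(self, key, value, expected_version=None):
        """Conditional write; expected_version=None forces the update."""
        version, _ = self._data.get(key, (0, None))
        if expected_version is not None and expected_version != version:
            raise VersionConflict(
                f"{key!r} is at version {version}, not {expected_version}"
            )
        self._data[key] = (version + 1, value)
        return version + 1
```

Two writers that both read version 1 will see the first conditional write succeed and the second raise `VersionConflict`; the loser can re-read and retry, or pass no expected version to overwrite regardless.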
    The Fabric Controller (FC)
    One per datacenter. Is replicated across several machines (maybe 5 to 7). Owns all of the resources in the fabric (including computers, switches, load balancers). Communicates with each machine's Fabric Agent. Every physical machine has a Fabric Agent running on it which keeps track of how many VMs are running on this machine, how each Azure app in each VM is doing, and so on. The FA reports this info back to the FC which uses it (along with the app's configuration file or model) to manage the app so that the app remains healthy (as defined in the app's configuration file).

    The FC monitors all running apps, manages its own infrastructure (e.g., keeps VMs patched, keeps OSs patched, ...), and performs resource allocation using its global view of the entire fabric along with the configuration file for the app that needs to be deployed. All of this is done automatically. The FC presently (i.e., in the CTP) allocates VMs to processor cores in a one-to-one manner (that is, no core will have more than one VM running on it).


    The FC maintains a graph which describes its hardware inventory. A node in the graph is a computer, load balancer, switch, router, or network management device. An edge is a networking cable (e.g., connecting a computer to a switch), power cable, or serial cable. When an app needs to be deployed, the FC may allocate an entire physical machine or, more commonly, may subpartition that machine into VMs and allocate one or more VMs to the given role. The size of the allocation depends upon the configuration settings for that role.
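The inventory graph described above can be sketched as a small adjacency structure. The class, the node kinds, and the cable labels are assumptions for illustration; the talk does not describe the FC's internal representation.

```python
from collections import defaultdict


class Inventory:
    """Toy hardware inventory: nodes are devices, edges are cables."""

    def __init__(self):
        self.kind = {}                 # node name -> "computer", "switch", ...
        self.edges = defaultdict(set)  # node name -> {(neighbor, cable type)}

    def add_node(self, name, kind):
        self.kind[name] = kind

    def connect(self, a, b, cable):
        """Record an undirected edge such as a network, power, or serial cable."""
        self.edges[a].add((b, cable))
        self.edges[b].add((a, cable))

    def computers_behind(self, switch):
        """Computers attached to the given switch by a network cable."""
        return sorted(
            name
            for name, cable in self.edges[switch]
            if cable == "network" and self.kind[name] == "computer"
        )
```

A query like `computers_behind("sw1")` is the sort of question a global view of the fabric makes cheap to answer, for instance when reasoning about which machines share a dependency on one switch.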

    The service model
    Is "declarative," which means rule-based. So the model or specification consists of a set of rules. Those rules are used by the Fabric Controller in initially allocating/provisioning resources for your service and to manage the service over its lifetime (i.e., maintain its health). The model contains info that you might otherwise communicate to the IT department that was going to deploy and run/manage your service in-house. This technique/setup is referred to as model-driven service management.

    What it looks like:
    • Contains the topology of your service (which comprises a graph):
      • Which roles? How many instances of each?
      • How are the roles linked (i.e., which roles communicate with which other roles)?
      • What interfaces exposed to the Internet?
      • What interfaces exposed to other services?
    • For each role, defines attributes of that role, e.g.,
      • What hosting environment does that role require? (E.g., IIS7, ASP.NET, CLR, ...)
      • How many resources does that role need? (e.g., CPU, disk, network latency, network bandwidth, ...)
    • Configuration settings: all are accessible at run-time. Can register to be notified when the value of a particular setting changes.
      • Application configuration settings: defined by the developer. Can think of these as being akin to command-line arguments that one would pass to a program. The app will behave differently depending upon the values of these arguments. Can use these to create different versions of your app where the versions differ on the basis of their values for these configuration settings.
      • System-defined configuration settings: predefined by the system. 
        • How many fault domains?
        • How many update domains?
        • How many instances of each role?
    • In the CTP, don't get to use entire service-model language. Instead, will be provided with various templates; can choose one and customize it for your particular app. Eventually, more flexibility (a higher level of control over service model configuration) will be exposed.
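The notes above say configuration settings are readable at run-time and that code can register to be notified when a setting changes. A generic sketch of that pattern follows; the `Settings` class and its method names are hypothetical and are not the Azure run-time API.

```python
class Settings:
    """Toy run-time settings store with change notification."""

    def __init__(self, initial):
        self._values = dict(initial)
        self._listeners = {}  # setting name -> list of callbacks

    def get(self, name):
        return self._values[name]

    def on_change(self, name, callback):
        """Register callback(name, old_value, new_value) for one setting."""
        self._listeners.setdefault(name, []).append(callback)

    def update(self, name, value):
        """Store a new value and notify listeners only if it really changed."""
        old = self._values.get(name)
        self._values[name] = value
        if old != value:
            for callback in self._listeners.get(name, []):
                callback(name, old, value)
```

This mirrors the "command-line arguments that can change while the program runs" view of application configuration settings: the role reads the value when needed and reacts in the callback when it is changed.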
    The service lifecycle
    1. [Developer] Develop code. Build/construct service model.
    2. [Developer/Deployer] Specify desired config.
    3. [FC] Deploy app; FC maps app to hardware in the way specified in the config/model.
      1. Resource allocation: choose hardware to allocate to the app based on the app's service model. Is basically a ginormous constraint satisfaction problem. Performed as a transaction: all resources allocated for an app or none are. Example constraints include: 
        • Question: Are there both system constraints and application-defined constraints? Then both sets are combined? For example, some of the sample constraints seem like something that the system would specify, such as "Node must have enough resources to be considered" or "Node must have a compatible hosting environment" whereas others seem more app-specific, as in "Each node should have at most a single instance of any given role assigned to it."
      2. Provisioning: Each item of allocated hardware is assigned a new "goal state." The FC will drive each piece of hardware to its goal state.
      3. Upgrades:
    4. [FC] Monitor app state and maintain app health, according to the service model and the SLA.
      • Handle software faults, hardware failures, need to scale up/down.
      • Logging infrastructure gathers the data needed to diagnose an app as unhealthy (where "health" is application-defined and provided in the app's service model).
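The resource allocation step above (a constraint satisfaction problem, performed all-or-nothing) can be sketched with a simple backtracking search over the three sample constraints quoted in the notes. The data layout and the function name are assumptions; the real FC allocator is surely more sophisticated.

```python
def allocate(instances, nodes):
    """Assign every role instance to a node, or return None (all or nothing).

    instances: list of dicts with "role", "cores", and "env" keys.
    nodes: dict of node name -> {"cores": free cores, "envs": set of envs}.
    Both layouts are hypothetical and exist only for this sketch.
    """
    free = {name: spec["cores"] for name, spec in nodes.items()}
    placed = {name: set() for name in nodes}  # roles already on each node
    assignment = []

    def backtrack(i):
        if i == len(instances):
            return True
        inst = instances[i]
        for name in sorted(nodes):
            if (free[name] >= inst["cores"]                 # enough resources
                    and inst["env"] in nodes[name]["envs"]  # compatible host
                    and inst["role"] not in placed[name]):  # one per role
                free[name] -= inst["cores"]
                placed[name].add(inst["role"])
                assignment.append(name)
                if backtrack(i + 1):
                    return True
                assignment.pop()  # undo the tentative placement and try again
                placed[name].remove(inst["role"])
                free[name] += inst["cores"]
        return False

    return list(assignment) if backtrack(0) else None
```

The function works on private copies of the free-core counts, so a failed attempt leaves `nodes` untouched, which is the transactional "all resources allocated for an app or none are" property in miniature.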
      Miscellaneous Other
      • Development environment allows desktop simulation of the entire Azure cloud experience. Can run your app locally and debug that way.
      • Can specify which data centers your app runs in and stores its data in.
      • Two-stage deployment to the cloud (which can also be used to achieve zero-downtime upgrades):
        • Upload application, access via .cloudapp.net, where cloudapp.net points to a load balancer (i.e., the DNS entry contains the load balancer's virtual IP (VIP)). Then connections to the app will be routed through this load balancer.
          • Presumably anyone can connect to the app (i.e., it's world reachable) but not everyone knows the particular globally unique ID associated with this app?
        • To make the app run live, DNS entry created which ties the application's domain name (e.g., myazureservice.net) to the load balancer's VIP.
      • Can layer any desired HTTP-based authentication mechanism on this. Or can use Windows Cloud Services (the Live Service in particular) which provides authentication.
      • Machines are grouped into fault domains, where a fault domain is a collection of systems with some common dependency, which means that this entire set of machines could go down at the same time. For example, all machines that sit behind a single switch are in the same fault domain as are all machines that rely on the same power supply. 
        • In configuring the settings for your Azure app and storage, you might specify that two instances of the same thing (e.g., two identical Web roles or two instances of Azure Storage) should not be located within the same fault domain.
        • And actually, for the case of Azure Storage, the Storage app takes care of making sure that it has more than one version of itself and is spread across different fault domains (rather than the Fabric Controller doing this).
        • Easy to determine, given the FC's graph of all inventory (i.e., hardware), what the fault domains are. Can determine statistically what the likelihood of failure is for each such domain.
      • Update domains: if you're doing a rolling upgrade of your service, a certain percentage of your service will be taken offline at any time (and upgraded). That percentage is the size of your update domain.
      • Security/Isolation
        • Use VMs to isolate one role from others
        • For PDC, app can only be managed code, which can be easily sandboxed
        • Run app code at a reduced privilege level
        • Network-based isolation
          • IP filtering
          • Firewalls: each machine configured to only allow traffic that machine expects
        • Automatically apply Windows security patches (heh).
      • MIX 2009 introduced the ability to run Web or worker roles with more privileges than previously was allowed. In particular, there are Code Access Security (CAS) levels, such as Partial Trust and Full Trust. In order to be able to run non-.NET (a.k.a., non-managed or native) code, the Web or Worker role needs to be able to spawn a process or to issue the Platform Invoke command. Apparently, one can only do those things with Full Trust. Hence, this is how MSFT will support non-managed code: you will configure your Web or Worker role to execute your non-managed code. So the Web or Worker role bootstraps your code.
        http://blogs.msdn.com/windowsazure/archive/2009/03/18/hosting-roles-under-net-full-trust.aspx

        Thus, in order for this to work, the code you execute has to be copy-deploy, which means that you don't have to install the code (since even Full Trust does not include administrative or install privileges); instead, you can merely copy the code to the VM and run it.
        http://blog.smarx.com/posts/does-windows-azure-support-java

      • The MSFT VMM is based on their Hyper-V Hypervisor, which was released in June 2008.
          TODO:
          • Read up on REST.
          • Get more details on fabric controller and data-center architecture etc.
            • Can see more about the FC, as far as redundancy goes, here (starting at about 30:00).
          • Get more details on the "service model," which is how the app developer communicates with the Azure OS Compute platform about how to manage his app.
          • More details on how the FC drives nodes into the goal state (i.e., monitors and maintains app health).
          • How do these virtual IPs and dedicated IPs (where the latter make up a pool that "back" the virtual IPs) work? Presumably there is some address translation at some point (mapping a connection to a virtual IP to instead be to a real IP), like NAT, but at which point? At the load balancer?
            • Maybe just means an internal IP address, such as 192.168.1.1?

          More on Azure

          Below is some basic data on Windows Azure as well as brief exploration into relevant background topics (such as managed code and dynamically generated Web page content). The below contains no details on the various Azure Services (such as Live Services and .NET Services) and few details about the Azure OS run-time. It's all pretty high-level and gauzy.

          Future posts may fill in more details such as:
          • What does execution look like on the cloud (what kind of isolation between web apps and so on)? Each machine in the fabric runs one instance of Azure OS? And each instance of Azure OS runs an instance of IIS7? And is there a one-to-one mapping between apps and instances of Azure OS (surely not)? Etc.
          • How exactly can one specify the health constraints of his app (where the Fabric Controller uses these conditions to automatically manage the app/service, including scaling the app up or down (creating/deleting instances) or restarting the app)?
          • What do the various Service APIs look like? (E.g., for SQL Data Services, Live Services, .NET Services, and so on) What operations are available?
          • How will MSFT support web apps written in non-MSFT languages (e.g., Python), i.e., how will they incorporate support for non-managed code?
          • ...
          Notes from Manuvir Das, "A Lap Around Azure"


          Windows Azure == OS for the cloud.
          It's the lowest layer. Services are layered on top of this foundation, including:
          • Live services
          • .NET services
          • SQL services (a.k.a. "Data Services")
          • SharePoint services
          • MSFT Dynamics CRM services
          These services are implemented using REST, HTTP, and XML.
          What's a cloud? A set of connected servers.
          What can you do on the cloud (i.e., on those servers)? Install and run services; store and retrieve data.
          A client can access Azure services via calling into a "managed class library" (presumably some code that contains RPC-like stubs which, at run-time, invoke the appropriate code).

          In the desktop computing world, an OS provides:
          • An environment in which to run an app (abstracts away the particular underlying hardware config)
          • Access to a shared file system, which provides isolation via access control
          • Resource allocation from a shared pool
          • Support for different programming environments
          In a cloud computing OS, you want all of the above plus 24/7 operation, pay-as-you-consume, transparent administration (i.e., which hides the complexity of remote mgmt as much as possible).

          What features does Azure provide?
          • Automated service management.
            • The developer defines rules about which code should be executed under which conditions (where a condition might be, "URL x was visited") as well as the code itself. The platform follows these rules in deploying, monitoring, and managing the developer's service.
          • A powerful hosting environment. All of the hardware for actually running and serving your code (servers, load balancers, firewalls?). Two possible execution modes: direct and virtualized (where the latter is via a hypervisor).
          • Scalable and available storage in the cloud. Provides abstractions such as: blobs, tables, and queues.
          • A rich familiar developer experience. Includes a local testing/debugging environment which provides a complete simulation of the cloud. So that developers can test their app in an environment as close to the real thing as possible. The simulated enviro lets the application do everything it would be able to do if it were actually running on the Azure OS in the cloud.
          More on Automated Service Management (Is this the so-called Fabric Controller? Yes, I think so.)
          The developer writes his code as before. But he also now creates a model, which specifies:
          • What should the service topology look like? How big? Use cloud storage? How many front-end roles? How many back-end roles? Should the front- and back-end roles be able to talk to one another? If so, how? (Note that these various roles can connect to the outside world, which enables the developer to communicate with a particular role directly.) A common idiom is for the different roles (e.g., Web Role, Worker Role) to communicate via shared storage; i.e., one adds to a queue (is a producer) and the other takes from that queue (is a consumer).
          • How to define health of my service (i.e., conditions under which I should be alerted because such conditions indicate that the service needs attention of some sort).
            • MS detects failures (of hardware, software, bugs, ...)
            • MS detects violation of health constraints (i.e. detects when your app's execution is outside of the definition of healthy)
            • MS needs a way to transparently fix stuff (ideally w/o a human in the loop). Maybe by rebooting your service, moving it to another server.
            • Achieve this via abstraction. Application code refers to logical resources (rather than to particular IP addresses or underlying CPUs). If need to obtain the actual hardware addresses at run-time, can invoke APIs which provide physical values (e.g., addresses) corresponding to specified logical resources.
          • Configuration settings: what particular values or parameters do I want to be able to change at run-time without having to redeploy the entire service?
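The producer/consumer idiom mentioned in the topology bullet above (a Web Role adds to a queue, a Worker Role takes from it) might look roughly like the following. This is only a local sketch in Python: the standard library's queue.Queue stands in for a cloud storage queue, and web_role / worker_role are hypothetical names I made up, not Azure APIs.

```python
import queue
import threading

# Stand-in for the shared cloud queue. In Azure this would be a storage queue;
# here it is an in-process queue.Queue (an assumption for illustration only).
work_queue = queue.Queue()
results = []

def web_role(requests):
    """Producer: the front end enqueues one work item per incoming request."""
    for req in requests:
        work_queue.put(req)
    work_queue.put(None)  # sentinel meaning "no more work"

def worker_role():
    """Consumer: the back end dequeues items and processes them."""
    while True:
        item = work_queue.get()
        if item is None:
            break
        results.append(item.upper())  # placeholder for real processing

consumer = threading.Thread(target=worker_role)
consumer.start()
web_role(["resize photo", "send email"])
consumer.join()
print(results)
```

Note that the two roles never call each other directly; the queue is the only thing they share, which is what lets each role be scaled or restarted independently.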
          Note that you don't *have to use* their automated, abstracted thing. They provide a so-called "escape hatch" or Raw mode, which lets the developer build a VM from the ground up and run his service within that VM (where the developer would be responsible for managing that service as well). So this offering much more closely resembles Amazon's EC2 service except even with this the developer doesn't actually supply his own VM (as he would with EC2) but rather configures one of their VMs.

          More on Azure Storage: massive scale, availability, durability. Geo-distribution. Geo-replication. This is NOT to be confused with SQL Services because Azure Storage does not expose the full database-management interface (which would include querying, insertion, schema creation, and so on). Only can upload data (and presumably delete it?). And available ways to structure data are very simple.
          • Large items of unstructured data: Blobs, file streams
          • Structured data (referred to as "service state"): Tables, caches
          • Service communication: queues, locks
          Cloud Storage is accessible from anywhere on the Internet. Has REST APIs on it.

          More on Developer Experience
          Support for a variety of programming languages: ASP.net, .NET, native code, PHP
          Bunch of tools and support, including for logging, alerts, tracing, ...
          Including the much touted "desktop SDK for full simulation of the cloud"

          My own look at things: questions and comments
          • Can I only run a web application on Azure? E.g., if I wanted to run a mail server on Azure, could I do that or not? If not, what mechanisms are actually used to prevent this? Restrict traffic that's not on a standard HTTP/HTTPS port?
            • Yes. According to this PDC 2008 session, an input endpoint (which is an app's externally-reachable interface to the world) must use either port 80 or port 443 (note that an app doesn't have to have an input endpoint). Hence, you couldn't define an app to run on Azure which had an input endpoint of port 25 (for SMTP), for example.
          • For the non-MSFT languages which can run on Azure (e.g., PHP, Python, Ruby), what run-time environment does a program in such a language run? Does the user choose the particular run-time environment?
          • And then what about sandboxing? What kind of isolation among various apps running on the Azure Fabric?
          Since someone will ask, what are the benefits of moving an app to the cloud?
          • If you have a customer-facing web application and you have customers scattered across the globe then can use Azure to run an instance of your app at various geographic locations. This will have the effect of reducing latency for customers accessing your application from India, for example.
          • Availability: Relieves the user from having to maintain redundant servers (and infrastructure for fault tolerance); let MSFT handle that.
          • Scalability: Let MS also handle scaling of your application automatically.
            • Frees app-provider up from responsibility of maintaining a number of machines for the service that corresponds to the expected peak load for that service (despite the fact that normal use might be well below peak).
          • Zero-downtime upgrades.
          • If you need huge amounts of storage or the ability to do batch processing or the ability to run an application on a very large data set (for example, as with MapReduce), you can achieve this by running your app / doing your processing in the cloud. Frees you up from having to purchase and maintain physical resources, especially since the job may only be a temporary thing (and hence the physical resources would be idle most of the time).
          At some point, will look in more detail at each Azure Service being offered, to understand what one can do with that service, who uses it, and so on. But for starters here they are:
          • Compute on Azure OS.
          • Store using Azure OS.
            • Cheap, efficient, not necessarily very expressive.
            • That is, this is NOT a full-service relational database interface that enables SELECT, INSERT, and so on. This is a very simple interface that only exposes a couple different storage formats (blob, queue, simple [non-relational] table) and presumably only a couple ways to manage (operate on) the stored data.
          • Access Live services for...
          • Access .NET services for...
          • Access SQL services for...
          • Access SharePoint services for...
          • Access MSFT Dynamics CRM services for...
            • CRM: Customer Relationship Management; software and/or processes/strategies. Includes all methods by which a company responds to (or reaches out to) its customers. So call center, sales force, marketing, tech support, field services. Players in this area: Oracle (Siebel, PeopleSoft), SAP, salesforce.com, Amdocs, MSFT, Epiphany, and others.
          The so-called "Fabric Controller"
          The Windows Azure Fabric is a "scalable hosting environment" that is "built on distributed MS data centers." The Fabric Controller manages resources, performs load balancing, observes developer-provided constraints/requirements for an app as well as the real-time conditions for that app/service and responds accordingly (by, for example, automatically provisioning additional resources, restarting the service, taking away some unneeded resources, and so on). The FC scales app resources automatically as demand rises and falls. Used to deploy service and manage upgrades.

          The Azure OS performs advanced tracing and logging on apps that run on it, so that developers can monitor the status of their apps as far as compute, storage, and bandwidth. Presumably, the constraints that developers specify are in terms of the type of things that can be observed using this tracing. Note that since MS is allowing apps written in non-MS languages, the types of "signals" that can be used to monitor the health of an app are likely OS-level (rather than language-level) things. For example, it is easy to identify the # of open sockets and heap size. It is more difficult for an OS to have visibility into language-level resource usage such as the number of locks created/acquired and so on. Hence, the types of things that one can specify in the application model must be rather generic and observable from the OS-level. I wonder whether, for apps written in MS languages and which will run in an MS environment (such as .NET framework), one can specify a different (richer) set of constraints in the application model.

          • Managed code == code written using an MSFT language and which executes on MSFT run-time; e.g., .NET, IIS7, WCF
          • Can only run a web application on Azure OS, not an arbitrary network app.
            • Enforced via only allowing an app to have an input endpoint whose port is 80 or 443; this means that the app can only receive traffic on the port for HTTP or the port associated with HTTPS.
          • The cloud on your desktop: complete offline cloud simulation. Actual cloud == set of connected servers or machines, also referred to as the fabric; it's what your app will run on. The "cloud on your desktop" is a set of processes (all of which run locally), where each process simulates a server. So the set of local processes are your "cloud on the desktop"; also referred to as "development fabric."
          • There's a UI for playing with / seeing this development fabric; can see each service deployment. For each service deployment, can see all the roles that this service has defined and, for each role, all of the instances of that role.
          • When you create a new project, two associated files are created: the service definition file (ends in *.csdef) and the service configuration file (ends in *.cscfg). These are XML files which contain metadata about your service. The service definition file defines all of the roles and, for each role, defines the input endpoints for that role (where an endpoint consists of a name, protocol (e.g., HTTP or HTTPS), and port (80 or 443, respectively)). If there are any configuration-specific settings or parameters, those parameters are declared here (but given values elsewhere). The service configuration file identifies for each role the number of instances of that role (that should be created) and, for any configuration settings declared in the service definition file, provides values for those settings/parameters.
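To make the split between the two files concrete, here is a rough sketch in Python rather than XML. The dictionaries are a hypothetical, simplified stand-in for the *.csdef and *.cscfg contents (the key names, role names, and setting names are mine, not the real schema). The validate function just mirrors the rules described above: an input endpoint is HTTP on 80 or HTTPS on 443, and every setting declared in the definition is given a value in the configuration.

```python
# Hypothetical stand-ins for the two files. The real files are XML and these
# key names are illustrative only.
service_definition = {
    "WebRole": {
        "endpoints": [{"name": "HttpIn", "protocol": "http", "port": 80}],
        "settings": ["StorageAccountName"],  # declared here, valued elsewhere
    },
    "WorkerRole": {"endpoints": [], "settings": ["PollIntervalSeconds"]},
}

service_configuration = {
    "WebRole": {"instances": 2, "settings": {"StorageAccountName": "demo"}},
    "WorkerRole": {"instances": 1, "settings": {"PollIntervalSeconds": "5"}},
}

ALLOWED_PORTS = {"http": 80, "https": 443}

def validate(definition, configuration):
    """Return a list of problems; an empty list means the pair is consistent."""
    problems = []
    for role, spec in definition.items():
        for ep in spec["endpoints"]:
            if ALLOWED_PORTS.get(ep["protocol"]) != ep["port"]:
                problems.append(f"{role}: endpoint {ep['name']} must be http/80 or https/443")
        config = configuration.get(role)
        if config is None:
            problems.append(f"{role}: no configuration entry")
            continue
        for name in spec["settings"]:
            if name not in config["settings"]:
                problems.append(f"{role}: setting {name} has no value")
    return problems

print(validate(service_definition, service_configuration))  # prints []
```

The point of the split is that the configuration half (instance counts, setting values) can change without touching the definition half.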
          Horizontal scaling: have code running on a single server; to scale == add more servers
          But what about state? How to share state across various instances? (Why need to share state? Because a user could interact with a different instance each time and the fact that there are multiple instances needs to be transparent to that user; i.e., the user can begin a transaction on one server and continue that transaction on another.) Solution: Use a single centralized database (a "durable store") to store all state; no server stores any state locally. All servers access state from this store == all servers have same view of state.

          Have this available in Azure as the Azure OS storage: blobs, simple tables, queues. Access this storage via REST and ADO.NET Data Services.
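A small sketch of the "no server stores any state locally" idea, in Python. DurableStore is a made-up stand-in (a dict) for the centralized store, and the shopping-cart example is my own; nothing here is an Azure API.

```python
class DurableStore:
    """Stand-in for the single centralized store; just a dict, for illustration."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


class Server:
    """A stateless instance: it keeps no per-user state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, user, item):
        cart = self.store.get(user, [])
        self.store.put(user, cart + [item])

    def view_cart(self, user):
        return self.store.get(user, [])


store = DurableStore()
server_a = Server("instance-a", store)
server_b = Server("instance-b", store)

# The user starts on one instance and continues on another.
server_a.add_to_cart("john", "book")
server_b.add_to_cart("john", "lamp")
print(server_a.view_cart("john"))
```

Because both instances read and write the same store, either one can serve the user's next request and see the same cart.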

          A brief detour: some background on Microsoft languages and run-time environments
          Evidently Microsoft has developed an infrastructure which enables code to be written in any one of several different languages then run on a variety of platforms. Parts of their infrastructure are reminiscent of Java with its bytecode and platform-specific JVMs. In particular, MSFT created various languages (e.g., C#, J#, VB.NET). Each of these languages has its own compiler which takes a program in the language and produces code in an intermediate language. This code is platform-agnostic (much the same as Java bytecode); the intermediate language is CIL (Common Intermediate Language). Then for each different hardware/OS platform, there is a Common Language Runtime (much the same as there are different JVMs for each hw/OS platform). CIL code is executed on the CLR. The CLR compiles and caches the CIL code just-in-time to the appropriate hardware instructions given the underlying CPU architecture (e.g., x86).

          So the Microsoft languages for which a compiler exists (which converts a program in that language into a program in the intermediate language, CIL) are referred to as managed; so a program written in C#, J#, or VB.NET is managed code. Such code (after being compiled into CIL) executes on the CLR, which is a managed environment. Unmanaged code by contrast is that which is not compiled into CIL and does not run within the CLR. So when we say that something is a CLI Language, we mean that there is a compiler that takes a program in that language and produces the corresponding CIL code. (Fyi, CIL was "previously known as MSIL, Microsoft Intermediate Language.") Actually, "managed code" is a general term which covers any code that runs within a VM (rather than executing directly on the underlying hardware). A C++ program written with Microsoft Visual C++ could be compiled into managed code (to run in the .NET CLR) or unmanaged code (using the old MFC framework). But in general all code written in a particular language will be compiled into managed or unmanaged code (rather than being able to be compiled at will into one form or the other).



          The CLR evidently exports an API that offers a lot of functionality that is usually provided by an OS. In particular, the CLR provides functions for: memory management, thread management, exception handling, garbage collection, and "security." This functionality is provided in the Class Libraries. The class libraries implement common functions such as: read/write, render graphics, interact w/DB, manipulate XML documents. The .NET Framework Class library consists of two libraries:
          1. Base Class Library (BCL): small subset of entire class library. Core set of classes that serve as the basic API of the CLR. Many of the functions provided by MSCORLIB.DLL and some of those provided in System.DLL and System.core.DLL. Akin to the standard libraries that come with Java.

          2. Framework Class Library (FCL): superset of BCL classes. Entire class library that ships with .NET Framework. Includes expanded set of libraries: WinForms, ADO.NET, ASP.NET, and others.

          [Figure: CLR diagram]

          So, as portrayed above, the Common Language Infrastructure (CLI) consists of the intermediate language (CIL) and the environment within which that intermediate language executes (the CLR). Together the CIL and the CLR comprise the CLI.

          What kind of benefits do you get by using a CLI?
          • Can have a program which combines components written in different high-level languages.
          • Can compile once (into CIL) and run anywhere (for which there exists a CLR). 
          • There are also some benefits of managed code generally: the ability to provide stronger type safety (since can do run-time type checking), garbage collection
          Another brief foray away from the main topic here: Web Page Content
          A web page can have only static content (the web page's content doesn't change) or it can have dynamic content. In the latter case, where does the dynamism come from? Two possibilities.
          1. There might be client-side scripting which changes the way the page is presented depending upon mouse movements, keyboard input, or timing events. So the dynamism is in how the content is presented. Languages used to achieve this include JavaScript (part of Dynamic HTML) and Action Script (part of Flash). These scripts might inject sound, animation or change the text. One can also perform remote scripting using these languages; remote scripting entails a Web page asking for more information from the server without having to reload the page. We see this in XMLHttpRequest, for example.

          2. Secondly, server-side scripts might dynamically generate different web page content depending upon the data provided by the user in an HTML form (e.g., the user enters his name (John) and the generated page says, "Hi John! Welcome to..."), URL parameters, the browser type, or DB/server state. The server-side languages used for this type of dynamic content generation include: PHP, Perl, ASP, ASP.NET, JSP, ColdFusion, and so on. Some of these can be run via the Common Gateway Interface (CGI); others (e.g., ASP.NET and JSP) run inside the web server or an application server instead.
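As a tiny illustration of the second (server-side) case, here is a sketch in Python of the "Hi John!" example. It uses only the standard library; render_welcome is a hypothetical name of mine, and a real server-side framework would wrap this in request handling.

```python
from html import escape
from urllib.parse import parse_qs

def render_welcome(form_body):
    """Build the page from URL-encoded form data, e.g. 'name=John'."""
    fields = parse_qs(form_body)
    name = fields.get("name", ["guest"])[0]
    # Escape user input before placing it in the HTML.
    return f"<html><body><p>Hi {escape(name)}! Welcome.</p></body></html>"

print(render_welcome("name=John"))
```

The same URL thus returns different content depending on what the user submitted, which is all "dynamic" means here.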
          Suffice it to say, we could explore these topics much more carefully and thoroughly. But the above suffices for our purposes, which are to understand what ASP or ASP.NET are (answer: server-side scripting languages, like PHP).

          References
          On the topic of Model-View-Control

          Friday, October 9, 2009

          Ad networks, exchanges, and auctions

          Definitions
          §         Display advertising: ad formats – videos, images and interactive ads
          §         Reach: of all unique Internet users, the percent that an ad network encounters. Pay attention to whether this is reported in terms of US or globally when comparing one network's reach to another's.
          §         Maximum yield: from the publisher’s perspective, the highest price he can get for each impression.
          §         Direct sales: the publisher sells his ad space directly to the advertiser, in contrast to an ad network, which serves as an intermediary between publishers and advertisers. Note that direct sales has been the prevailing model for selling premium publisher space – e.g., the front page of the NYT etc.
          §         Ad slot: defines the size of an ad and the specific location where the ad will appear on a particular page or on a group of similar pages. The name of an ad slot might incorporate: the site name (ESPN), the section within that site (Football), the subsection within that section (e.g., NCAA College Football), page position (e.g., Top, Middle, Center combined with Left, Right, Center), and dimensions (728x90).
          §         Placement: a set of ad slots. It lets you group together slots that a single advertiser (or ad network) might want to simultaneously target.
          §         An insertion order (IO): an agreement between an ad seller and a buyer that specifies the details of the campaign. Contains one or more line items. Serves as the purchase order and contract between the parties. Often includes info such as pricing, impression goals, delivery options, and targeting details.
          §         Line item: an advertiser’s commitment to purchase a specified # of impressions, clicks, or time (CPD) on certain dates at the specified price.
          §         Ad trafficking: done by an ad server; entails providing the code – which downloads and tracks an ad – to the publisher to embed in his site (so that when a user executes that code, the ad will be served to him). I.e., delivery of the ad.
          §         Forecasting: a challenge for publishers. The act of predicting what inventory – in which ads can be placed – you will have in the future.
          §         Ad operations: handles fulfillment of online ad sales. Entails trafficking (the day-to-day execution of campaigns; including ad delivery), inventory management (inventory forecasting), sales, web development (produce and develop the advertising activity).

          See also: Google’s “Introduction to Ad Operations”
          http://services.google.com/training/gamtutorials/Intro_to_Ad_Operations/

          What’s an ad server?
          Ad serving refers to the tasks involved in storing an ad and delivering it on-demand to the specified user. It also entails keeping track of how many times each ad was served, where an ad was served to, how many times the ad was clicked on, and so on. These are all reporting-related activities. An ad server may also perform targeting – wherein the server decides which ad to serve based on the user who will view the ad. In this case, the ad server might store a cookie on every user’s computer (for users who interact with this ad server at some point) and use that cookie to figure out stuff about the user – age, sex, marital status, education level, household income, and so on. That info – along with more mundane facts, such as what operating system a user is running, which browser he’s using and so on – is used as input into the decision about what ad to serve the user.

          What are some more specific features involved at the various stages of ad serving?
          §         To make the size of a bid depend on an ad’s past performance.
          §         To cap the frequency with which an ad is served, the pace at which it’s served, the targeting (to whom it’s served).
          §         Dynamic selection: target only the impressions you want
          §         A centralized clearing system – don’t have to interface directly with each publisher.
          §         Provide targeting, which in this context means: advise the advertiser on which sites he might like to display his graphical ad on. So the ad server doesn’t necessarily facilitate this transaction but he provides guidance into identifying placements that make sense, given the advertiser’s target audience segment, geography, time of day, website content, browser, OS, keywords, …

          There are a number of ways for a publisher to access this functionality:
          1.      Run his own ad server (machine) on his own site premises; e.g., OpenX is an open-source ad server implementation (in the same way that Apache is an implementation of a web server) that a user can download, install, run, and manage in-house. The publisher controls every aspect of ad sales, storage, delivery, and reporting – as well as manages the physical hardware used to perform these tasks.

          2.      Run an instance of OpenX in the cloud. In this case, the publisher retains total control over all aspects of ad serving (sales, storage, delivery, reporting) but doesn’t have to run and maintain the actual physical ad server in his own equipment room.

          3.      Contract with an ad server who provides everything but sales for a fee. That is, the ad server does storage, delivery, and reporting; the publisher takes care of sales. The ad server may provide info about the type of users that a publisher attracts. This info can be used by the publisher in identifying advertisers that are a good fit for his visitors.

          4.      Finally, a publisher can contract with an ad network. In this case, the ad network runs its own ad server (which does storage, delivery, and reporting) as well as handles ad sales. Ad networks were developed to enable an advertiser to simultaneously have his ads shown on various Web sites without having to contract individually with each such site. Google, Yahoo, Microsoft, AOL, Fox Interactive Media, AdBrite, ValueClick and so on are all examples of ad networks. Note that when people talk about ad networks, they generally mean display-ad networks. That is, an ad network is an interface between people who want to show display ads to users and sites who have space available for rent for just that reason.

          This is probably the most common model – in terms of what the highest number of publishers do. It may not be the most common model in terms of revenues; that is, high profile publishers together command the lion’s share of display advertising revenues. And these high profile (or premium) publishers have historically done direct sales of their advertising space – rather than relying on an ad network.


          References:

          Background on Ad Networks
          §         Definition: an intermediary between publishers and advertisers.
          §         Primarily associated with the graphical or display advertising space, rather than the search advertising market.
          §         Ad Networks provide a way for an advertiser’s content to run across the Web – i.e., on lots of websites that are not necessarily under the control of the same administrative domain.
          §         Provide a way to mitigate the effects of evermore audience fragmentation.
          §         An ad network may also serve or host the actual ads that run on these publisher pages; that is, visiting the publisher page causes a request to be sent to the ad network and the ad content to be downloaded from (i.e., served by) that ad network.
          §         Tracks ads.
          §         Reports on the distribution of ads.
          §         Handles the transaction between advertiser and publisher.
          §         Frees publisher up from having to maintain a sales organization (in order to monetize his inventory).
          §         Gets paid on a revenue-sharing basis.

          How do ad networks vary?
          §         What audience they reach: size, demographics, …
          §         The ways in which they can target: vertical, contextual, behavioral, demographic, re-targeting, geographic, site-specific
          §         Type of compensation they accept: CPM, CPC, CPL (Lead), CPA
          §         Ad formats they support: display, text, in-text, video, mobile, in-game, blog, RSS, email, audio/podcast, widgets
          §         Their business model: revenue share, arbitrage, rep firm
          §         The extent to which they provide related services to publishers and advertisers.
          §         The amount and quality of user information they possess.
          To the extent that behavioral marketing takes off, someone like Google has reams of information that they can deploy to better perform such targeted marketing. This is in contrast to an entity such as Microsoft that doesn’t have as much of this kind of info.


          Background on Ad Exchanges
          §         What’s an ad exchange? A trading exchange for display advertising in which website owners and advertisers can reach deals on prices and placement of ads.
          §         A stock exchange for online display ads; i.e., a real-time marketplace with an auction-based system and open bidding process – sells impressions in real-time or on-the-spot (auction platforms are associated with impression-by-impression purchasing).
          §         They can also be used as a futures market – to acquire reserved inventory.
          §         A web site puts up ad space for auction and ad agencies bid for those spots.
          §         What problem does it solve? Managing co-ordinated, large-scale display advertising campaigns across the net is a logistical nightmare. Given that an effective campaign has to span a multitude of display ad formats and thousands of sites, it takes ages to plan and manage campaigns.
          §         Who are the players in the exchange?
          §         The large online publishers participate in the exchange to sell their inventory.
          §         Ad networks (and agencies) are the buyers. So an advertiser contracts with an ad network or ad agency which maintains a network of properties (placements) where the advertiser’s ads can be placed.
          o       The buyer will likely have his own ad serving technology.
          §         You can also buy space on the AdSense publisher network via the exchange.
          §         Similarly, if you’re an AdWords customer (i.e., advertiser), you can buy ad space on the exchange – via your AdWords interface (you accomplish this by electing to have your ads shown on the Google Content Network).
          §         And if you’re an AdSense customer (i.e., publisher), you can sell display space on your site on the exchange – via the AdSense interface (check this out).
          §         “On an exchange, publishers can choose to sell their inventory blind, private, or branded. Which they choose will differ from publisher to publisher, but obviously branded inventory will still be able to command a premium, taking it out of the commodity area. Publishers can also disqualify particular advertisers from whom they do not want to accept advertising (for competitive or other reasons), or make positive qualifications regarding the type of advertisers they will accept (e.g., women's interest only). On the buy side, marketers or their agencies are offered similar control.”

          Benefits of Ad Exchanges
          §         Transparent, dynamic pricing; open bidding.
          §         Simplified, standardized business processes (ad sale/purchase) makes things easier.
          §         Better liquidity for ad inventory.
          §         Smaller advertisers get the same access to exchange inventory as the Big Guys (those with bigger ad budgets and better relationships). A leveling of the playing field.
          §         Can use technology to automate things such as: bid on any ad space that has properties X, Y, Z and is available for less than $N.
          §         Eliminates intermediaries and their margins.
          §         A publisher gets access to many more advertisers while still retaining control over who can advertise on the publisher’s site.
          §         Buyers can use technology that lets them bid in real-time (based on various criteria and via an automated agent?).
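The automation bullet above ("bid on any ad space that has properties X, Y, Z and is available for less than $N") can be sketched as a simple rule. This is Python with made-up names (AdSpace, BidRule) and made-up inventory; it is not any exchange's actual API.

```python
from dataclasses import dataclass

@dataclass
class AdSpace:
    site: str
    properties: frozenset  # e.g. {"sports", "728x90", "US"}
    asking_cpm: float      # price in dollars per 1000 impressions

@dataclass
class BidRule:
    required: frozenset    # the "X, Y, Z" the space must have
    max_cpm: float         # the "$N" ceiling

    def matches(self, space):
        # The space must have every required property and cost less than the ceiling.
        return self.required <= space.properties and space.asking_cpm < self.max_cpm

rule = BidRule(required=frozenset({"sports", "US"}), max_cpm=5.00)
inventory = [
    AdSpace("site-a.example", frozenset({"sports", "US", "728x90"}), 4.25),
    AdSpace("site-b.example", frozenset({"sports", "UK", "728x90"}), 3.00),
    AdSpace("site-c.example", frozenset({"sports", "US", "300x250"}), 6.50),
]
to_bid_on = [space.site for space in inventory if rule.matches(space)]
print(to_bid_on)
```

Only site-a qualifies: site-b is missing a required property and site-c is over the price ceiling.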

          How do ad exchanges vary?
          §         Their inventory
          §         Their targeting methods
          §         Their placement options
          §         Their pricing models

          Does a publisher usually have an exclusive relationship with a particular ad exchange – so that the publisher cannot offer his space on multiple exchanges simultaneously? Also, are the ad exchanges themselves connected in any way – so that a buyer on one ad exchange can purchase through it property that is listed on another exchange?
          I think a goal is certainly to interconnect the ad exchanges (if this isn’t the case already). Also note that a publisher may have generic deals with advertisers – not for any specific impression but for impressions which match certain criteria. Those generic deals can be compared with real-time bids (obtained in the ad exchange auction) to see which is best and make the sell decision then. In this way, a publisher reconciles an advertiser’s desire to make a futures buy (reserve inventory) with the benefits of a per-impression sale.

          Incorporating AdWords bids into the DoubleClick Ad Exchange
          OK, we’ll walk through an example to hopefully illuminate at least the auction process itself, if not the entire context. Let’s say that there is some slot (i.e., impression) being offered on the DoubleClick Ad Exchange. And that slot has been explicitly targeted by an AdWords advertiser (via a site placement for his display ad) or the slot contextually matches the advertiser’s graphical ad (or industry, whatever). And actually let’s say that the slot matches several AdWords advertisers’ bids: ad_1, ad_2, ad_3, and ad_4.

          Advertiser   CTR   # clicks/M   CPC   eCPM   QS   Ad Rank
          ad_1         1%    10           $3    $30    7    $210
          ad_2         2%    20           $3    $60    8    $480
          ad_3         5%    50           $2    $100   6    $600
          ad_4         3%    30           $3    $90    7    $630

          eCPM is the effective CPM rate and is calculated by identifying the # of clicks per 1000 impressions that an ad is expected to obtain (# clicks/M) then multiplying that value by the cost per click bid. Quality Score is multiplied by eCPM in this case in order to get AdRank (rather than by the CPC). Then we can see that ad_4 wins and ad_3 comes in second place; only these two bids will be considered by the Ad Exchange.
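
The arithmetic behind the table can be reproduced with a few lines of Python. This is just a sketch of the calculation as described in this post (eCPM = clicks per 1000 impressions × CPC, and Ad Rank = eCPM × Quality Score); the function names are mine, and it is not a claim about how AdWords actually computes these values.

```python
from decimal import Decimal

# (CTR, CPC bid in dollars, Quality Score) for each AdWords ad in the table.
ads = {
    "ad_1": (Decimal("0.01"), Decimal("3"), 7),
    "ad_2": (Decimal("0.02"), Decimal("3"), 8),
    "ad_3": (Decimal("0.05"), Decimal("2"), 6),
    "ad_4": (Decimal("0.03"), Decimal("3"), 7),
}


def ecpm(ctr, cpc):
    """Expected revenue per 1000 impressions: (# clicks/M) * CPC."""
    clicks_per_thousand = ctr * 1000
    return clicks_per_thousand * cpc


def ad_rank(ctr, cpc, quality_score):
    """Ad Rank as used in this example: eCPM * Quality Score."""
    return ecpm(ctr, cpc) * quality_score


# Sort ads from highest to lowest Ad Rank and keep the top two.
ranked = sorted(ads, key=lambda name: ad_rank(*ads[name]), reverse=True)
top_two = ranked[:2]

for name in ranked:
    print(name, ecpm(*ads[name][:2]), ad_rank(*ads[name]))
```

Only the names in `top_two` (ad_4, then ad_3) go forward to the Ad Exchange.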

          Advertiser   Ad Exchange bid   >= minCPM?   Publisher’s cut   Ad Exchange cut
          ad_4         $580              Yes          $493              $87.00
          ad_3         $552              Yes          $469.20           $82.80
          minCPM       $520              Yes          $442              $78.00
          adExch_1     $375              No
          adExch_2     $500              No
          adExch_3     $750              Yes          $637.50           $112.50

          First we have to take out the AdWords cut so that the bid represents: what the publisher will make plus what the Ad Exchange gets for the trouble (transaction). Let’s say (HYPOTHETICALLY) that the AdWords revenue share is 8%. Then we have: ad_4 == $630 * 0.92 == $579.60 (let’s call it $580) and ad_3 == $600 * 0.92 == $552. Naturally, taking out the AdWords revenue share doesn’t change the relative ordering between these two ads – because we assume that revenue share is the same (rate) for everyone. Now let’s combine those bids with other Ad Exchange bids and with the publisher’s min CPM.
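
As a quick check of that step, here is the same calculation in Python. The 8% rate is the hypothetical figure from the paragraph above, and the names `ADWORDS_SHARE` and `exchange_bid` are just labels chosen for this sketch.

```python
from decimal import Decimal

ADWORDS_SHARE = Decimal("0.08")  # hypothetical revenue share, as stated in the text


def exchange_bid(ad_rank):
    """What is left of an AdWords bid once the AdWords share is taken out."""
    return ad_rank * (1 - ADWORDS_SHARE)


bids = {
    "ad_4": exchange_bid(Decimal("630")),
    "ad_3": exchange_bid(Decimal("600")),
}

for name, bid in bids.items():
    print(name, bid)
```

Because both bids are scaled by the same factor (0.92), ad_4 still comes out ahead of ad_3.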

          So now we’re considering the most lucrative AdWords ads (ad_3 and ad_4) alongside the publisher’s minimum CPM and any Ad Exchange bids for that slot (impression). The next thing we do is subtract the publisher’s cut from each bid (which we’re assuming to be 85% across the board; rumor has it that 50% is a more likely estimate), so that we’re left with what the DoubleClick Ad Exchange stands to make on the deal. THEN we take into account that the publisher may have other bids for this space that were obtained by his own ad-sales team. You can think of these as commitments that the publisher has made to various advertisers, but these commitments can be fulfilled within a time window (rather than in real time, for one specific impression). So the publisher could always use one of those bids. And if one of those bids is better than all of the bids from the Ad Exchange (including the AdWords bids), which are all represented as what the Exchange stands to make on the deal, then the publisher should serve this in-house ad in this impression.
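
A small sketch of the minCPM filter and the publisher/exchange split, using the $580 and $552 figures derived above for ad_4 and ad_3 together with the three other exchange bids from the example. The 85% publisher share is the assumption made in this post, and the variable and function names are mine.

```python
from decimal import Decimal

PUBLISHER_SHARE = Decimal("0.85")  # assumed rate from the text, not a known figure
MIN_CPM = Decimal("520")           # the publisher's minimum CPM in this example

bids = {
    "ad_4": Decimal("580"),
    "ad_3": Decimal("552"),
    "adExch_1": Decimal("375"),
    "adExch_2": Decimal("500"),
    "adExch_3": Decimal("750"),
}


def split(bid):
    """Return (publisher's cut, Ad Exchange cut) for a bid."""
    publisher_cut = bid * PUBLISHER_SHARE
    return publisher_cut, bid - publisher_cut


# Bids below the publisher's minimum CPM drop out before any cuts are computed.
eligible = {name: split(bid) for name, bid in bids.items() if bid >= MIN_CPM}

for name, (publisher_cut, exchange_cut) in eligible.items():
    print(name, publisher_cut, exchange_cut)
```

Changing `PUBLISHER_SHARE` to `Decimal("0.50")` shows what the rumored 50% split would do to the numbers.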

          So the DoubleClick Ad Exchange interfaces with the publisher’s own ad server to obtain that ad server’s bids for these slots. Those bids are specified in terms of what the Ad Exchange would make on the transaction and hence can be compared (apples-to-apples) to the values in the final column of the above table. So then we’re looking at:

          Advertiser   DoubleClick Exchange cut
          ad_4         $87.00
          ad_3         $82.80
          minCPM       $78.00
          adExch_3     $112.50
          inHouse_1    $96.00
          inHouse_2    $73.25
          inHouse_3    $101.50

          So in this case, the winner of the slot was one of the bidders from the Ad Exchange. If the publisher has “dynamic allocation,” then this transaction will complete automatically (without requiring any approval) and the ad from the adExch_3 advertiser will be shown to the Web site visitor whose impression triggered this whole execution cycle.
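
The final comparison is then just a maximum over the Exchange's cut for every candidate. The sketch below uses the figures from this example; the `inHouse_` prefix check is only my way of labeling where the winner came from, and the minCPM row is left out because it is a floor rather than a bidder.

```python
from decimal import Decimal

# What the Exchange would make on each candidate, exchange bids and in-house bids alike.
candidates = {
    "ad_4": Decimal("87.00"),
    "ad_3": Decimal("82.80"),
    "adExch_3": Decimal("112.50"),
    "inHouse_1": Decimal("96.00"),
    "inHouse_2": Decimal("73.25"),
    "inHouse_3": Decimal("101.50"),
}


def pick_winner(cuts):
    """Return (name, source) of the candidate with the largest exchange cut."""
    name = max(cuts, key=cuts.get)
    source = "in-house" if name.startswith("inHouse_") else "exchange"
    return name, source


winner = pick_winner(candidates)
print(winner)

# If adExch_3 had not bid, the best in-house commitment would take the slot.
without_top = {k: v for k, v in candidates.items() if k != "adExch_3"}
fallback = pick_winner(without_top)
print(fallback)
```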

          Note that a publisher doesn’t necessarily have any in-house bids (i.e., bids from the publisher’s own ad server); its direct sales channel may be empty for this slot. Note also that the final decision about which bid wins appears to be made so as to maximize the DoubleClick Exchange’s cut. But, so long as the publisher’s cut is the same rate across all advertisers, this is the same as getting the maximum cut possible for the publisher: the winning bid will be the one with the largest principal, from which both cuts are taken.
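
That last point is easy to check numerically. With a single rate p between 0 and 1 applied to every bid b, the publisher gets p × b and the Exchange gets (1 − p) × b, so both are largest for the same b. A sketch using the eligible exchange bids from the example, with both the assumed 85% rate and the rumored 50% rate:

```python
from decimal import Decimal

# Bids that cleared the publisher's minimum CPM in the example.
bids = {"ad_4": Decimal("580"), "ad_3": Decimal("552"), "adExch_3": Decimal("750")}


def winner_by(cut):
    """Name of the bid that maximizes the given cut function."""
    return max(bids, key=lambda name: cut(bids[name]))


results = []
for rate in (Decimal("0.85"), Decimal("0.50")):
    by_publisher = winner_by(lambda b: b * rate)        # maximize publisher's cut
    by_exchange = winner_by(lambda b: b * (1 - rate))   # maximize Exchange's cut
    by_principal = winner_by(lambda b: b)               # maximize the bid itself
    results.append((by_publisher, by_exchange, by_principal))

print(results)
```

The equivalence would no longer hold if the publisher's rate differed from one advertiser to the next.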