Software Architecture 5a ATC Case Study Courseware.ppt
Slide 1: Case Study: Air Traffic Control
Zhang Pingjian, School of Software Engineering, South China University of Technology

Slide 2: Air Traffic Control (ATC)
The problem is to control a very large number of aircraft from take-off to landing.
Problem features:
- Hard real time: no tolerance for missed deadlines
- Ultrahigh availability
- Safety critical
- Highly distributed

Slide 3: Flying from point A to point B in the U.S. air traffic control system

Slide 4: En route centers in the United States

Slide 5: Flight Monitoring
Flight from Key West to DC:
- Key West ground control (to taxi to the runway)
- Key West Tower (from take-off until leaving the airport airspace)
- ZMA en route zone center
- ZJX en route zone center
- ZTL en route zone center
- ZDC en route zone center
- DC Tower (arrival airport)
- Ground control (to taxi again)
Advanced Automation System (AAS) components:
- Ground Control
- Airport Tower
- En Route Centers: Initial Sector Suite System (ISSS)
This study focuses on ISSS only.

Slide 6: ISSS Influences
ISSS was only one part of AAS.
Notes on the design of ISSS:
- Many components in common with the rest of AAS
- Interfaces to radio systems, the flight-plan DB, and each other
- Common quality requirements for availability and reliability
- So ISSS was influenced by the requirements for all of AAS
History:
- ISSS was a real system: it was designed and most of its code was developed
- It was not deployed; it was scaled back to a more economical, more staged solution (budget cuts)
- Outside audit: the architecture and design were analyzed by an independent audit team that judged “satisfies requirements.”
- The system that was deployed borrowed heavily from ISSS
http:/

Slide 7: ABC of the Air Traffic Control System

Slide 8: Requirements and Quality Attributes
- The ATC system is highly visible, with enormous commercial, governmental, and public interest.
- Failures carry great potential for loss of life and costly property damage.
Thus the two most important quality attributes were:
- Ultrahigh availability
  - It is essential that “unavailability” be limited to very short periods
  - Availability requirement of 0.99999: unavailable for about 5 minutes per year; however, short recovery periods (within 10 seconds) did not count
- High performance
  - Handle up to 2440 aircraft effectively and efficiently

Slide 9: Other Requirements and Quality Attributes
- Openness: the system must be able to incorporate commercially developed components
- Ability to field subsets of the system
- Modifiability: accommodating changes to functionality and upgrades in hardware and software
- Interoperability: the ability to operate and interface with a wide range of external systems

Slide 10: Stakeholders
- FAA
- Controllers (end users): could reject the system if it was not to their liking, even if it met all functional requirements
- Usability attribute? Actually handled by taking great care with requirements and design (thus slowing the process)

Slide 11: Sector Suites
- Sector suite: a group of air-traffic controllers, each with their own console, who collectively handle all the aircraft in a sector
- Sectors could be defined differently at each center
  - Could be done physically
  - Could be done to balance the load
  - Less densely traveled sectors could be made larger
- Planes are passed off from: departure airport -> en route zone center -> arrival airport
- Also within a zone: sector -> sector -> ... -> sector before passing to the next center

Slide 12: ISSS Design
ISSS requires flexibility in the number of control stations per sector (1 to 4).
At least two controllers per sector:
1. Radar controller
   - Monitors radar
   - Communicates with aircraft
   - Responsible for maintaining separation of aircraft
2. Data controller
   - Retrieves flight plans etc.
   - Supplies the radar controller with the “intentions” of aircraft

Slide 13: ISSS Implementation Metrics
- The system contains about 1 million lines of Ada code
- Designed to support up to 210 consoles per en route center.
- Each console was a workstation with an IBM RS/6000 processor
- Required to handle from 400 to 2440 aircraft simultaneously
- There may be from 16 to 40 radar units supporting a single facility
- A center may have from 60 to 90 control positions

Slide 14: ISSS Functionality Summary
- Acquire radar target reports from the existing ATC system, the Host Computer System (henceforth “Host”)
- Convert radar reports for display and broadcast them to all consoles (consoles can switch the areas that are displayed)
- Handle conflict alerts (potential collisions)
- Interface with the Host for input and to retrieve flight plans
- Provide extensive monitoring of the system itself to allow dynamic reconfiguration
- Provide recording capability for later playback
- Provide a good GUI
- Provide reduced backup capability in the event of failure of the Host, the primary network, or the primary radar sensors

Slide 15: ISSS Architecture
Views:
1. Physical View
2. Module Decomposition View
3. Process View
4. Client-Server View
5. Code View
6. Layered View
7. Fault Tolerance View

Slide 16: Physical View

Slide 17: Physical View Notes
- HCS A: Host Computer System A (primary)
  - Processes radar and flight-plan information
  - Outputs to consoles (radar) and flight-strip printers (flight plans)
- HCS B: backup Host
- Common Consoles: the workstations
- Local Communications Network (LCN): connects the consoles and the Hosts
  - Each Host has two LCN interface units, called LIU-H
  - The LCN is composed of 4 parallel token ring networks:
    1. One supports broadcast of radar information
    2. One for point-to-point communication between workstations
    3. One records data for later playback
    4. A spare

Slide 18: Physical View Notes
- The Backup Communication Network (BCN) is an Ethernet using TCP/IP
- Both the LCN and the BCN have monitor and control consoles for maintenance personnel
- The Enhanced Direct Access Radar Channel (EDARC) provides a backup display of information in case the Host is lost.
- EDARC supplies raw data to the External System Interface (ESI) processor
- Central processors: mainframes that provided record and playback for an early version of ISSS
- Testing and training subsystem: allows training of new personnel and testing of new equipment without interfering with live operations

Slide 19: Module Decomposition View
The elements are called Computer Software Configuration Items (CSCIs), as required by the government software development standard that the customer mandated.
5 CSCIs:
1. Display Management
2. Common System Services
   - General ATC utilities; ISSS is 1/3 of AAS
3. Recording, analysis and playback
4. National Airspace System Modification
   - Modifying software on the Host
5. IBM AIX operating system

Slide 20: Module Decomposition View: Tactics
The CSCIs formed deliverable units of software and documentation.
Tactics:
- Semantic coherence: the main one, guiding the well-defined and non-overlapping decomposition
- Abstract common services: the Common System Services module
- Record/playback: testability
- Generalizing the module: well-designed interfaces

Slide 21: Process View
- Concurrency resides in “applications”, which correspond roughly to processes in the sense of Dijkstra's cooperating sequential processes
- An Ada main unit is a process schedulable by the OS
- ISSS was designed to work on more than one processor
- Processors are grouped into “processor groups”
- Critical to fault tolerance and thus availability:
  - One primary, the rest backup
  - PAS: primary address space
  - SAS: standby address space
  - Operational unit: the collection of a primary and its standbys
- Functional groups (FGs) are the components not implemented in this fault-tolerant fashion (replicated on several groups)

Slide 22: Process View

Slide 23: Primary Failure Switchover
1. The PAS fails
2. A standby (SAS) is promoted to PAS
3. The new PAS sends messages announcing the failure and starts providing all services
4. A new SAS is started to replace the old, failed PAS
5. The new SAS sends a message to notify the new PAS
6. Adding a new operational unit is similar but more complex
Tactics: state resynchronization and passive redundancy

Slide 24: Adding a new Operational Unit
1. Identify the necessary input data and its location.
2. Identify where (which operational unit or FG) to send output.
3. Fit the operational unit's communication patterns into the system-wide acyclic graph such that it remains acyclic and deadlocks cannot occur.
4. Design messages to achieve this.
5. Identify the internal state data that must be used for checkpointing (it must be included in the PAS-to-SAS updates).
6. Define messages: message types and data.
7. Plan for switchover on failure; test for consistency.
8. Ensure processing steps complete within a heartbeat.
9. Plan data sharing and synchronization with other operational units.

Slide 25: C/S View

Slide 26: Client-Server View
- Communication takes place between PAS elements of operational units (acting as client and server)
- The client sends a “service request message”
- The server acknowledges and responds with results
- Within operational units, PASs send updated state to their SASs
- Within FGs there is nothing extra: just the ACK and the results

Slide 27: Code View
The code view describes how functionality is mapped onto code units.
ISSS code view:
- Ada main program
- Subprograms grouped into packages (separately compilable)
- An Ada program consists of one or more tasks (threads)
- Applications (operational units and functional groups) are decomposed into Ada packages

Slide 28: Layered View
- Shared memory (Tables and Message Storage)
- AAS application
- Shared memory (Tables and Message Storage)
- CAS AIX Kernel Extension
- AIX Kernel

Slide 29

Slide 30: Fault Tolerance View
- M&C console
- Global Availability Manager
- Local/Group Availability Manager
- ATC console
- Application Software: Operational Unit (Thread Processing Model)
- OS extensions: Address Space Models
- Network
- Operating System
- Processor
- I/O devices

Slide 31: Component-and-connector view for fault tolerance

Slide 32: Fault Tolerance Hierarchy
Each level of the hierarchy:
- Detects errors in itself, its peers, and all lower levels
- Handles exceptions from lower levels
- Diagnoses, recovers, reports, or raises exceptions
Levels from top to bottom:
- System monitor and control
- Global availability manager
- Group availability manager
- Local availability manager
- Application
- Runtime environment
- Operating system
- Physical level: processors, networks, devices

Slide 33: Fault Tolerance Hierarchy
Fault detection at each level is by:
- Built-in tests
- Event time-outs
- Network circuit tests
- Group membership protocols
- Human reaction to alarms
Fault recovery can be automatic or manual.
For availability managers, recovery is decision-table driven.
In a PAS there are 4 types of recovery:
1. In a switchover, the SAS takes over from the old PAS
2. A warm restart uses checkpoint data saved to non-volatile memory
3. A cold restart uses default start-up data
4. A cutover is used to transition to new logic or data

Slide 34: Fault Tolerance Hierarchy
Fault tolerance of the hardware is achieved via redundancy:
- LCN, BCN, various bridges
- Backup radar and a separate channel for it
- Processor hardware replicated within a processor group
Availability tactics used here for fault tolerance:
- “Ping/echo”
- “Heartbeat”
- “Exception”, to transfer errors to the correct place
- “Spare”, to perform recovery

Slide 35: Relating the Views
Additional insight comes from examining the relationships between views, that is, from mapping one view to another.
In ISSS:
- CSCIs are the elements of the module decomposition view (they are composed of applications)
- Applications (processes) are the elements of the process view and of the client-server view
- Applications are implemented as Ada packages and programs, the elements of the code view
- Applications are turned into threads at runtime, the elements of the concurrency view
- The special quality-attribute view (fault tolerance) uses elements from the process, layered, and module views

Slide 36: “Configuration Files” Tactic
ISSS makes extensive use of the modifiability tactic “configuration files” (ISSS called this adaptation data).
- Site-specific data allows ISSS to be configured for each of the 22 en route centers
- This configuration is fairly extensive and powerful
  - E.g., splitting an ATC console window into two
- “Generalize the module” tactic
Negative side:
- It takes a powerful interpretation mechanism to support this level of adaptability at run time
- The mechanism is therefore complex to maintain if changes are required there
- Different configurations substantially complicate testing

Slide 37: “Abstract Common Services” Tactic
- PAS and SAS really come from the same source
  - No difference in the code
  - Just a dynamic-state boolean variable, “primaryStatus”
- Code template structure for all operational units
  - “Abstracting Common Services” tactic
  - The common part is abstracted into the template

Slide 38: Code Template Affects Other Tactics
Other modifiability tactics addressed by the code template:
- “Anticipation of expected changes”
- “Semantic coherence”
- “Generalizing the module”
- Making interfaces part of the template: “maintain interface stability” and “adherence to defined protocols”

Slide 39

Slide 40

Slide 41

Slide 42: ISSS Summary
Architectural solutions can be the key to meeting the needs of an application (especially its quality attribute requirements):
- High availability: fault tolerance
- Longevity: high modifiability, interoperability
- Audit of ISSS before abandoning
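
The Primary Failure Switchover steps (slide 23) and the single-source PAS/SAS code template (slide 37) can be illustrated with a small model. This is only a sketch under simplifying assumptions: everything lives in one Python process, failure is triggered by hand rather than detected by heartbeat, and the class names `AddressSpace` and `OperationalUnit`, their methods, and the sample flight data are hypothetical, not taken from ISSS (which was written in Ada). Only the `primaryStatus` flag (spelled `primary_status` here) and the order of the steps come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class AddressSpace:
    """One copy of an application. PAS and SAS run the same code and
    differ only in the primary_status flag (hypothetical model)."""
    name: str
    primary_status: bool = False
    state: dict = field(default_factory=dict)
    alive: bool = True

    def handle(self, key, value):
        """Service a request; only the primary is allowed to do so."""
        if not self.primary_status:
            raise RuntimeError(f"{self.name} is a standby")
        self.state[key] = value
        return dict(self.state)


class OperationalUnit:
    """A primary address space together with its standbys."""

    def __init__(self, names):
        self.members = [AddressSpace(n) for n in names]
        self.members[0].primary_status = True
        self.log = []

    @property
    def primary(self):
        return next(m for m in self.members if m.alive and m.primary_status)

    def request(self, key, value):
        snapshot = self.primary.handle(key, value)
        # The PAS sends its updated state to every live SAS.
        for m in self.members:
            if m.alive and not m.primary_status:
                m.state = dict(snapshot)
        return snapshot

    def fail_primary(self, replacement_name):
        old = self.primary
        old.alive = False                      # step 1: the PAS fails
        old.primary_status = False
        standby = next((m for m in self.members if m.alive), None)
        if standby is None:
            raise RuntimeError("no standby available")
        standby.primary_status = True          # step 2: a SAS is promoted
        self.log.append(f"{standby.name}: took over from {old.name}")   # step 3
        fresh = AddressSpace(replacement_name, state=dict(standby.state))
        self.members.append(fresh)             # step 4: a new SAS is started
        self.log.append(f"{fresh.name}: standing by for {standby.name}")  # step 5
        return standby


unit = OperationalUnit(["A", "B"])
unit.request("AA123", "FL330")
new_primary = unit.fail_primary("C")
print(new_primary.name, new_primary.state)  # B {'AA123': 'FL330'}
```

Because the standby already holds the state pushed by the old primary, promotion needs no recovery from checkpoint data. This is the passive-redundancy idea behind the switchover, as opposed to the warm or cold restart on slide 33.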