
The Data Stack


In enterprise IT and service provider environments these days, you’re likely to hear lots of discussion about software-defined infrastructure. In one way or another, everybody now seems to understand that IT is moving into the era of SDI.

 

There are good reasons for this transformation, of course. SDI architectures enable new levels of IT agility and efficiency. When everything is managed and orchestrated in software, IT resources—including compute, storage, and networking—can be provisioned on demand and automated to meet service-level agreements and the demands of a dynamic business.

 

For most organizations, the question isn’t, “Should we move to SDI?” It’s, “How do we get there?” In a previous post, I explored this topic in terms of a high road that uses prepackaged SDI solutions, a low road that relies on build-it-yourself strategies, and a middle road that blends the two approaches together.

 

In this post, I will offer up a maturity-model framework for evaluating where you are in your journey to SDI. This maturity model has five stages in the progression from traditional hard-wired architecture to software-defined infrastructure. Let’s walk through these stages.

 

Standardized

 

At this stage of maturity, the IT organization has standardized and consolidated servers, storage systems, and networking devices. Standardization is an essential building block for all that follows. Most organizations are already here.

 

Virtualized

 

By now, most organizations have leveraged virtualization in their server environments. While enabling high levels of consolidation and greater utilization of physical resources, server virtualization accelerates service deployment and facilitates workload optimization. The next step is to virtualize storage and networking resources to achieve similar gains.

 

Automated

 

At this stage, IT resources are pooled and provisioned in an automated manner. In a step toward a cloud-like model, automation tools enable self-service provisioning portals—allowing, for example, a development and test team to provision its own infrastructure—and move the organization closer to frictionless IT.
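
To make the idea concrete, here is a minimal sketch of what a self-service request might trigger behind such a portal, written against the OpenStack SDK. The cloud name, image, flavor, and network names are hypothetical placeholders, and a real portal would add quotas, approvals, and error handling.

```python
# Minimal self-service provisioning sketch using the OpenStack SDK.
# Cloud, image, flavor, and network names below are hypothetical placeholders.
import openstack


def provision_dev_test_server(name: str) -> None:
    """Provision one server in response to a self-service portal request."""
    conn = openstack.connect(cloud="devtest-cloud")   # credentials from clouds.yaml

    image = conn.compute.find_image("ubuntu-22.04")    # hypothetical image name
    flavor = conn.compute.find_flavor("m1.medium")     # hypothetical flavor name
    network = conn.network.find_network("devtest-net") # hypothetical tenant network

    server = conn.compute.create_server(
        name=name,
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{"uuid": network.id}],
    )
    # Block until the instance reaches ACTIVE, then report status.
    server = conn.compute.wait_for_server(server)
    print(f"{server.name} is {server.status}")


if __name__ == "__main__":
    provision_dev_test_server("devtest-web-01")
```

In practice the portal would wrap this kind of call in role-based access controls and chargeback, but the point stands: the whole stack is provisioned by software, not by a ticket queue.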

 

Orchestrated

 

At this higher stage of IT maturity, an orchestration engine optimizes the allocation of data center resources. It collects hardware platform telemetry and uses that information to place applications on the servers best suited to them: servers with features that accelerate the workload, located in approved locations, for optimal performance and the assigned level of trust. The orchestration engine acts as an IT watchdog that spots performance issues, takes remedial action, and then learns from those events to continue to meet or exceed the customer's needs.
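
As a rough illustration of that placement logic, the toy scorer below ranks candidate hosts by telemetry. The field names, weights, and thresholds are invented for the example; a real orchestration engine would use far richer signals.

```python
# Illustrative only: a toy telemetry-driven placement scorer.
from dataclasses import dataclass


@dataclass
class HostTelemetry:
    name: str
    cpu_util: float          # 0.0-1.0 rolling average
    inlet_temp_c: float      # platform thermal telemetry
    trusted: bool            # e.g., attested boot chain
    location: str            # data center / geo tag
    has_accelerator: bool    # platform feature useful to the workload


def score(host: HostTelemetry, required_location: str, needs_accel: bool) -> float:
    """Higher is better; hard requirements return -1 (ineligible)."""
    if not host.trusted or host.location != required_location:
        return -1.0
    if needs_accel and not host.has_accelerator:
        return -1.0
    headroom = 1.0 - host.cpu_util                      # prefer idle capacity
    thermal_margin = max(0.0, (40.0 - host.inlet_temp_c) / 40.0)
    return 0.7 * headroom + 0.3 * thermal_margin


def place(hosts: list[HostTelemetry], required_location: str, needs_accel: bool) -> str:
    best_score, best_name = max((score(h, required_location, needs_accel), h.name) for h in hosts)
    if best_score < 0:
        raise RuntimeError("no eligible host")
    return best_name
```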

 

SLA Managed

 

At this ultimate stage—the stage of the real-time enterprise—an organization uses IT service management software to maintain targeted service levels for each application in a holistic manner. Resources are automatically assigned to applications to maintain SLA compliance without manual intervention. The SDI environment makes sure the application gets the infrastructure it needs for optimal performance and compliance with the policies that govern it.
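
The sketch below shows the shape of such a closed loop: compare a measured service level against its target and ask the orchestrator for more, or fewer, resources. The metric source and scaling hooks are placeholders, not a real product API.

```python
# Toy closed-loop SLA controller; replace the placeholders with your
# monitoring system and orchestrator of choice.
import time

SLA_TARGET_MS = 200          # hypothetical p95 latency target
CHECK_INTERVAL_S = 60


def p95_latency_ms(app: str) -> float:
    """Placeholder: pull the latest p95 latency from your monitoring system."""
    raise NotImplementedError


def scale(app: str, delta: int) -> None:
    """Placeholder: ask the orchestrator to add or remove instances."""
    raise NotImplementedError


def sla_loop(app: str) -> None:
    while True:
        latency = p95_latency_ms(app)
        if latency > SLA_TARGET_MS:
            scale(app, +1)                    # breach: add capacity
        elif latency < 0.5 * SLA_TARGET_MS:
            scale(app, -1)                    # well under target: reclaim resources
        time.sleep(CHECK_INTERVAL_S)
```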

 

In subsequent posts, I will take a closer look at the Automated, Orchestrated, and SLA Managed stages. For now, the key is to understand where your organization falls in the SDI maturity model and what challenges need to be solved in order to take this journey. This understanding lays the groundwork for the development of strategies that move your data center closer to SDI—and the data center of the future.

Every disruptive technology in the data center forces IT teams to rethink the related practices and approaches. Virtualization, for example, led to new resource provisioning practices and service delivery models.

 

Cloud technologies and services are driving similar change. Data center managers have many choices for service delivery, and workloads can be more easily shifted between the available compute resources distributed across both private and public data centers.

 

Among the benefits stemming from this agility, new approaches for lowering data center energy costs have many organizations considering cloud alternatives.

 

Shifting Workloads to Lower Energy Costs

 

Every data center service and resource has an associated power and cooling cost. Energy, therefore, should be a factor in capacity planning and service deployment decisions. But many companies do not leverage all of the energy-related data available to them, and without the right tools it is challenging to make sense of the information generated by servers, power distribution units, airflow and cooling units, and other smart equipment.

 

That’s why holistic energy management is essential to optimizing power usage across the data center. IT and facilities teams can rely on user-friendly consoles, such as graphical thermal and power maps of the data center, to gain a complete picture of the patterns that correlate workloads and activity levels with power consumption and dissipated heat. Specific services and workloads can also be profiled, and logged data builds a historical database for establishing and analyzing temperature patterns. Having one cohesive view of energy consumption also reduces the need to rely on less accurate theoretical models, manufacturer specifications, or manual measurements that are time consuming and quickly out of date.
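
For illustration, the snippet below shows one way per-server readings might be rolled up into a rack-level view and a history log. The reading names and sampling mechanism are assumptions, not a specific product's interface.

```python
# Sketch: roll per-server power/temperature samples up to rack level and log history.
import csv
import time
from collections import defaultdict


def sample_sensors() -> list[dict]:
    """Placeholder: poll each server's power (W) and inlet temp (C),
    e.g., via IPMI/Redfish or a DCIM agent."""
    raise NotImplementedError


def rack_rollup(samples: list[dict]) -> dict:
    racks = defaultdict(lambda: {"power_w": 0.0, "max_temp_c": 0.0})
    for s in samples:
        rack = racks[s["rack"]]
        rack["power_w"] += s["power_w"]
        rack["max_temp_c"] = max(rack["max_temp_c"], s["inlet_temp_c"])
    return dict(racks)


def log_history(path: str, samples: list[dict]) -> None:
    """Append raw samples for trend analysis and capacity planning."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        ts = time.time()
        for s in samples:
            writer.writerow([ts, s["server"], s["rack"], s["power_w"], s["inlet_temp_c"]])
```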

 

A Case for Cloud Computing

 

This makes the case for cloud computing as a means to manage energy costs. Knowing how workload shifting will decrease the energy requirements for one site and increase them for another makes it possible to factor in the different utility rates and implement the most energy-efficient scheduling. Within a private cloud, workloads can be mapped to available resources at the location with the lowest energy rates at the time of the service request. Public cloud services can be considered, with the cost comparison taking into account the change to the in-house energy costs.
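
A toy comparison along these lines might look like the following; the sites, utility rates, and PUE values are purely illustrative.

```python
# Illustrative per-site energy cost comparison for placing a workload.
SITES = {
    "us-west":  {"rate_usd_per_kwh": 0.08, "pue": 1.4},
    "us-east":  {"rate_usd_per_kwh": 0.12, "pue": 1.6},
    "eu-north": {"rate_usd_per_kwh": 0.06, "pue": 1.2},
}


def hourly_energy_cost(site: str, it_load_kw: float) -> float:
    s = SITES[site]
    # PUE scales the IT load up to total facility load.
    return it_load_kw * s["pue"] * s["rate_usd_per_kwh"]


def cheapest_site(it_load_kw: float) -> str:
    return min(SITES, key=lambda s: hourly_energy_cost(s, it_load_kw))


if __name__ == "__main__":
    load = 5.0  # kW drawn by the workload's servers
    for site in SITES:
        print(f"{site}: ${hourly_energy_cost(site, load):.2f}/hour")
    print("cheapest:", cheapest_site(load))
```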

 

From a technology standpoint, any company can achieve this level of visibility and use it to take advantage of the cheapest energy rates across its data center sites. Almost every data center is tied to at least one other site for disaster recovery, and distributed data centers are common for a variety of reasons. Add to this scenario all of the domestic and offshore regions where Infrastructure as a Service is booming, and businesses have the opportunity to tap into global compute resources that leverage lower-cost power, including areas where infrastructure providers can pass through cost savings from government subsidies.

 

Other Benefits of Fine-Grained Visibility

 

For the workloads that remain in the company’s data centers, increased visibility also arms data center managers with knowledge that can drive down the associated energy costs. Energy management solutions, especially those that include at-a-glance dashboards, make it easy to identify idle servers. Since these servers still draw approximately 60 percent of their maximum power requirements, identifying them can help adjust server provisioning and workload balancing to drive up utilization.
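
A simple filter over monitoring data is enough to surface such candidates; the thresholds and field names below are illustrative.

```python
# Sketch: flag servers that stay nearly idle yet still draw a large share
# of their maximum power. Thresholds are illustrative.
def idle_servers(inventory: list[dict],
                 util_threshold: float = 0.05,
                 power_fraction: float = 0.5) -> list[str]:
    """Return servers averaging under 5% CPU while drawing over 50% of max power."""
    flagged = []
    for srv in inventory:
        drawing = srv["avg_power_w"] / srv["max_power_w"]
        if srv["avg_cpu_util"] < util_threshold and drawing > power_fraction:
            flagged.append(srv["name"])
    return flagged


# Example: a server idling at 3% CPU but drawing 360 of 600 W gets flagged.
print(idle_servers([{"name": "node-17", "avg_cpu_util": 0.03,
                     "avg_power_w": 360, "max_power_w": 600}]))
```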

 

Hot spots can also be identified. Knowing which servers or racks are consistently running hot can allow adjustments to the airflow handlers, cooling systems, or workloads to bring the temperature down before any equipment is damaged or services disrupted.

 

Visibility of the thermal patterns can be put to use for adjusting the ambient temperature in a data center. Every degree that temperature is raised equates to a significant reduction in cooling costs. Therefore, many data centers operate at higher ambient temperatures today, especially since modern data center equipment providers warrant equipment for operation at the higher temperatures.

 

Some of the same energy management solutions that boost visibility also provide a range of control features. Thresholds can be set to trigger notifications and corrective actions when power spikes occur, and the data can even help identify the systems that would be at greatest risk during a spike. Servers operating near their power and temperature limits can be proactively adjusted and configured with built-in protection such as power capping.
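
The sketch below captures the shape of that logic: alert on a spike and cap the servers running closest to their limits first. The cap-setting call is a placeholder for whatever node-manager or BMC interface is in use.

```python
# Illustrative threshold-and-cap logic; not a specific vendor API.
SPIKE_THRESHOLD_W = 8000        # hypothetical rack-level limit


def apply_power_cap(server: str, cap_w: int) -> None:
    """Placeholder: set a hardware power cap via the node manager / BMC."""
    raise NotImplementedError


def check_rack(rack_power_w: float, servers: list[dict]) -> None:
    if rack_power_w <= SPIKE_THRESHOLD_W:
        return
    print(f"ALERT: rack at {rack_power_w:.0f} W exceeds {SPIKE_THRESHOLD_W} W")
    # Cap the servers operating nearest their own limits first.
    at_risk = sorted(servers, key=lambda s: s["power_w"] / s["limit_w"], reverse=True)
    for srv in at_risk[:3]:
        apply_power_cap(srv["name"], int(0.9 * srv["power_w"]))
```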

 

Power capping can also provide a foundation for priority-based energy allocations. The capability protects mission-critical services, and can also extend battery life during outages. Based on knowledge extracted from historical power data, capping can be implemented in tandem with dynamic adjustments to server performance. Lowering clock speeds can be an effective way to lower energy consumption, and can yield measurable energy savings while minimizing or eliminating any discernable degradation of service levels.

 

Documented use cases for real-time feedback and control features such as thresholds and power capping prove that fine-grained energy management can yield significant cost reductions. Typical savings of 15 to 20 percent of the utility budget have been measured in numerous data centers that have introduced energy and temperature monitoring and control.

 

Understand and Utilize Energy Profiles

 

As the next step in the journey that began with virtualization, cloud computing is delivering on its promises: more data center agility, centralized management that lowers operating expenses, and cost-effective support for the needs of fast-changing businesses.

 

With an intelligent energy management platform, the cloud also positions data center managers to more cost-effectively assign workloads to leverage lower utility rates in various locations. As energy prices remain at historically high levels, with no relief in sight, this provides a very compelling incentive for building out internal clouds or starting to move some services out to public clouds.

 

Every increase in data center agility, whether from earlier advances such as virtualization or the latest cloud innovations, emphasizes the need to understand and utilize energy profiles within the data center. Ignoring the energy component of the overall cost can hide a significant operating expense from the decision-making process.

The industry continues to advance the iWARP specification for RDMA over Ethernet, first ratified by the Internet Engineering Task Force (IETF) in 2007.

 

This article in Network World, “iWARP Update Advances RDMA over Ethernet for Data Center and Cloud Networks,” which I co-authored with Wael Noureddine of Chelsio Communications, describes two new extensions added to help developers of RDMA software by aligning iWARP more closely with the RDMA technologies based on the InfiniBand network and transport, i.e., InfiniBand itself and RoCE. By bringing these technologies into alignment, we move closer to the Open Fabrics Alliance goal that the application developer need not concern herself with which of these is the underlying network technology -- RDMA will "just work" on all of them.

 

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.

 

*Other names and brands may be claimed as the property of others.

The international melting pot of Vancouver, BC provides a perfect backdrop for the OpenStack Summit, a semi-annual get-together of the developer community driving the future of open source in the data center. After all, it takes a melting pot of community engagement to build “the open source operating system for the cloud.” I came to Vancouver to get the latest from that community. This week’s conference has provided an excellent state of the union on where OpenStack is in delivering its vision to be the operating system for the cloud, how both the industry and the user community are working to innovate on top of OpenStack to drive the ruggedness required for enterprise and telco deployments, and where gaps still exist between industry vision and deployment reality. This state of the union is delivered across Summit keynotes, hundreds of track sessions, demos, and endless meetings, meetups, and other social receptions.


 

Intel’s investment in OpenStack reflects the importance of open source software innovation to delivering our vision of Software Defined Infrastructure. Our work extends from our core engagement as a leader in the OpenStack Foundation to projects focused on ensuring that software takes full advantage of Intel platform features to drive higher levels of security, reliability, and performance, and to collaborations that help ensure the demos of today become the mainstream deployments of tomorrow.

 

So what’s new from Intel this week? Today, Intel announced Clear Containers, a project associated with Intel Clear Linux designed to ensure that container-based environments leverage Intel virtualization and security features to both improve speed of deployment and extend a hardware root of trust to container workloads. We also announced the beta delivery of Cloud Integrity Technology 3.0, our latest software aimed at delivering workload attestation across cloud environments, and showcased demos ranging from trusted VMs to intelligent workload scheduling to NFV workloads on trusted cloud architecture.


To learn more about Intel’s engagement in the OpenStack community, please check out a conversation with Jonathan Donaldson as well as learn about Intel’s leadership on driving diversity in the data center as seen through the eyes of some leading OpenStack engineers.


Check back tomorrow to hear more about the latest ecosystem and user OpenStack innovation as well as my perspectives on some of the challenges ahead for industry prioritization.  I’d love to hear from you about your perspective on OpenStack and open source in the data center. Continue the conversation here, or reach out @techallyson.

Cloud computing offers what every business wants: the ability to respond instantly to business needs. It also offers what every business fears: loss of control and, potentially, loss of the data and processes that enable the business to work. Our announcement at the OpenStack Summit of Intel® Cloud Integrity Technology 3.0 puts much of that control and assurance back in the hands of enterprises and government agencies that rely on the cloud.

 

Through server virtualization and cloud management software like OpenStack, cloud computing lets you instantly, even automatically, spin up virtual machines and application instances as needed. In hybrid clouds, you can supplement capacity in your own data centers by "bursting" capacity from public cloud service providers to meet unanticipated demand. But this flexibility also brings risk and uncertainty. Where are the application instances actually running? Are they running on trusted servers whose BIOS, operating systems, hypervisors, and configurations have not been tampered with? To assure security, control, and compliance, you must be sure applications run in a trusted environment. That's what Intel Cloud Integrity Technology lets you do.

 

Intel Cloud Integrity Technology 3.0 is software that enhances security features of Intel® Xeon® processors to let you assure that applications running in the cloud run on trusted servers and virtual machines whose configurations have not been altered. Working with OpenStack, it ensures that when VMs are booted or migrated to new hardware, the integrity of virtualized and non-virtualized Intel x86 servers and workloads is verified remotely using Intel® Trusted Execution Technology (TXT) and Trusted Platform Module (TPM) technology on Intel Xeon processors. If this "remote attestation" finds discrepancies with the server, BIOS, or VM—suggesting the system may have been compromised by a cyber attack—the boot process can be halted. Otherwise, the application instance is launched in a verified, trusted environment spanning the hardware and the workload.
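
Conceptually, the attestation check sits in front of workload placement. The sketch below illustrates the idea only; the endpoint, report fields, and trust criteria are hypothetical and are not the actual Cloud Integrity Technology API.

```python
# Conceptual sketch: consult an attestation service for a host's trust report
# before allowing a VM to launch there. Endpoint and fields are hypothetical.
import requests

ATTESTATION_URL = "https://attestation.example.internal/v1/hosts"  # hypothetical


def host_is_trusted(hostname: str) -> bool:
    resp = requests.get(f"{ATTESTATION_URL}/{hostname}/report", timeout=10)
    resp.raise_for_status()
    report = resp.json()
    # Trust here means a verified boot chain (measurements matching known-good
    # values) plus an acceptable location tag.
    return report.get("trust_status") == "trusted" and report.get("location_ok", False)


def launch_vm(hostname: str, vm_spec: dict) -> None:
    if not host_is_trusted(hostname):
        raise RuntimeError(f"refusing to launch: {hostname} failed attestation")
    # ...hand off to the scheduler / hypervisor here...
```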

 

In addition to assuring the integrity of the workload, Cloud Integrity Technology 3.0 also enables confidentiality by encrypting the workload prior to instantiation and storing it securely using OpenStack Glance. An included key management system that you deploy on premises gives the tenant complete ownership and control of the keys used to encrypt and decrypt the workload.

 

Cloud Integrity Technology 3.0 builds on earlier releases to assure a full chain of trust from bare metal up through VMs. It also provides location controls to ensure workloads can only be instantiated in specific data centers or clouds. This helps address the regulatory compliance requirements of some industries (like PCI and HIPAA) and geographical restrictions imposed by some countries.

 

What we announced at OpenStack Summit is a beta availability version of Intel Cloud Integrity Technology 3.0. We'll be working to integrate with an initial set of cloud service providers and security vendor partners before we make the software generally available. And we'll submit extensions to OpenStack for Cloud Integrity Technology 3.0 later this year.

 

Cloud computing is letting businesses slash time to market for new products and services and respond quickly to competitors and market shifts. But to deliver the benefits promised, cloud service providers must assure tenants their workloads are running on trusted platforms and provide the visibility and control they need for business continuity and compliance.

 

Intel Xeon processors and Cloud Integrity Technology are enabling that. And with version 3.0, we're enabling it across the stack from the hardware through the workload. We're continuing to extend Cloud Integrity Technology to storage and networking workloads as well: storage controllers, SDN controllers, and virtual network functions like switches, evolved packet core elements, and security appliances. It's all about giving enterprises the tools they need to capture the full potential of cloud computing.

By Tony Dempsey


I’m here attending the OpenStack Summit in Vancouver, BC and wanted to find out more about OPNFV, a cross-industry initiative to develop a reference architecture that operators can use for their NFV deployments. Intel is a leading contributor to OPNFV, and I was keen to learn more, so I attended a special event being held as part of the conference.

 

Heather Kirksey (OPNFV Director) kicked off today’s event by describing what OPNFV is all about, including the history of why OPNFV was formed and an overview of the areas OPNFV is focused on. OPNFV is a carrier-grade, integrated open source platform intended to accelerate the introduction of new NFV products and services. The initiative grew out of the ETSI NFV ISG, and its initial focus is on the NFVI layer.

 

OPNFV’s first release will be called Arno (releases are named after rivers) and will include OpenStack, OpenDaylight, and Open vSwitch. No date for the release is available just yet, but it is thought to be soon. Notably, Arno is expected to be used in lab environments initially, rather than in commercial deployments. High availability (HA) will be part of the first release (the control and deployment sides are supported). The plan is to make OpenStack telco-grade rather than to create a separate telco-grade version of OpenStack. AT&T gave an example of how they plan to use the initial Arno release: they will bring it into their lab, add additional elements to it, and test for performance and security. They see this release very much as a means to uncover gaps in open source projects, help identify fixes, and upstream those fixes. OPNFV is committed to working with the upstream communities to maintain a good relationship. Down the road it might be possible for OPNFV releases to be deployed by service providers, but currently this is a development tool.

 

An overview of OPNFV’s continuous integration (CI) activities was given along with a demo. The aim of the CI activity is to give fast feedback to developers in order to increase the rate at which software is developed and improve its quality. Chris Price (TSC Chair) spoke about requirements for the projects and working with upstream communities. According to Chris, OPNFV’s focus is working with the open source projects to define the issues, understand which open source community can likely solve the problem, work with that community to find a solution, and then upstream that solution. Mark Shuttleworth (founder of Canonical) gave an auto-scaling demo showing a live vIMS core (from Metaswitch) with CSCF auto-scaling running on top of Arno.

 

I will be on the lookout for more OPNFV news throughout the Summit to share. In the meantime, check out Intel Network Builders for more information on Intel’s support of OPNFV and solutions delivery from the networking ecosystem.

By Suzi Jewett, Diversity & Inclusion Manager, Data Center Group, Intel

 

I have the fantastic job of driving diversity and inclusion strategy for the Data Center Group at Intel. For me it is the perfect opportunity to align my skills, passions, and business imperatives in a full-time role. I have always had the skills and passions, but it was not until recently that the business imperative grew within the company to the point that we needed a full-time person in this role, and many similar roles throughout Intel. As a female mechanical engineer, I have always known I am one of the few, and at times that was awkward, but even I didn’t know the business impact of not having diverse teams.


Over the last 2-3 years, the data on the bottom-line business results of having diverse people on teams and in leadership positions has become clear, and the evidence is overwhelming that we can no longer accept flat or dwindling representation of diverse people on our teams. We also know that all employees have more passion for their work and are able to bring their whole selves to work when we have an inclusive environment. Therefore, we will not achieve the business imperatives we need to unless we embrace diverse backgrounds, experiences, and thoughts in our culture and in our every decision.

 

Within the Data Center Group one area that we recognize as well below where we need it to be is female participation in open source technologies. So, I decided that we should host a networking event for women at the OpenStack Summit this year and really start making our mark in increasing the number of women in the field.

 

Today I had my first opportunity to interact with people working in OpenStack at the Women of OpenStack event. We had a beautiful cruise around the Vancouver harbor and then chatted the night away at Black + Blue Steakhouse. About 125 women attended, along with a handful of male allies (yeah!). The event was put on by the OpenStack Foundation and sponsored by Intel and IBM. The excitement of the women there and the non-stop conversation were energizing to be a part of, and it was obvious that the women loved having some kindred spirits to talk tech and talk life with. I was able to learn more about how OpenStack works, why it’s important, and the passion of everyone in the room to work together to make it better. I learned that many of the companies design features together, meeting weekly and assigning ownership to divvy up the work needed to deliver a feature to the code. Being new to open source software, I was amazed that this is even possible, and excited at the same time to see the opportunity for real diversity in our teams, because collaborative design can bring in a vast amount of diversity and create a better end product.

 


 

A month or so ago I got asked to help create a video to be used today to highlight the work Intel is doing in OpenStack and the importance to Intel and the industry of having women as contributors. The video was shown tonight along with a great video from IBM and got lots of applause and support throughout the venue as different Intel women appeared to talk about their experiences. Our Intel ‘stars’ were a hit and it was great to have them be recognized for their technical contributions to the code and leadership efforts for Women of OpenStack. What’s even more exciting is that this video will play at a keynote this week for all 5000 attendees to highlight what Intel is doing to foster inclusiveness and diversity in OpenStack!

 

By Mike Pearce, Ph.D., Intel Developer Evangelist for the IDZ Server Community

 

 

On May 5, 2015, Intel Corporation announced the release of its highly anticipated Intel® Xeon® processor E7 v3 family.  One key area of focus for the new processor family is that it is designed to accelerate business insight and optimize business operations—in healthcare, financial, enterprise data center, and telecommunications environments—through real-time analytics. The new Xeon processor is a game-changer for those organizations seeking better decision-making, improved operational efficiency, and a competitive edge.

 

The Intel Xeon processor E7 v3 family’s performance, memory capacity, and advanced reliability now make mainstream adoption of real-time analytics possible. The rise of the digital service economy, and the recognized potential of "big data," open new opportunities for organizations to process, analyze, and extract real-time insights. The Intel Xeon processor E7 v3 family tames the large volumes of data accumulated by cloud-based services, social media networks, and intelligent sensors, and enables data analytics insights, aided by optimized software solutions.

 

A key enhancement to the new processor family is its increased memory capacity – the industry’s largest per socket1 - enabling entire datasets to be analyzed directly in high-performance, low-latency memory rather than traditional disk-based storage. For software solutions running on and/or optimized for the new Xeon processor family, this means businesses can now obtain real-time analytics to accelerate decision-making—such as analyzing and reacting to complex global sales data in minutes, not hours.  Retailers can personalize a customer’s shopping experience based on real-time activity, so they can capitalize on opportunities to up-sell and cross-sell.  Healthcare organizations can instantly monitor clinical data from electronic health records and other medical systems to improve treatment plans and patient outcomes.

 

By automatically analyzing very large amounts of data streaming in from various sources (e.g., utility monitors, global weather readings, and transportation systems data, among others), organizations can deliver real-time, business-critical services to optimize operations and unleash new business opportunities. With the latest Xeon processors, businesses can expect improved performance from their applications, and realize greater ROI from their software investments.

 

 

Real Time Analytics: Intelligence Begins with Intel

 

Today, organizations like IBM, SAS, and Software AG are placing increased emphasis on business intelligence (BI) strategies. The ability to extract insights from data is something customers expect from their software in order to maintain a competitive edge. Below are just a few examples of how these firms are able to use the new Intel Xeon processor E7 v3 family to meet and exceed customer expectations.

 

Intel and IBM have collaborated closely on a hardware/software big data analytics combination that can accommodate any size workload. IBM DB2* with BLU Acceleration is a next-generation database technology and a game-changer for in-memory computing. When run on servers with Intel’s latest processors, IBM DB2 with BLU Acceleration optimizes CPU cache and system memory to deliver breakthrough performance for speed-of-thought analytics. Notably, the same workload can be processed 246 times faster3 running on the latest processor than the previous version of IBM DB2 10.1 running on the Intel Xeon processor E7-4870.

 

By running IBM DB2 with BLU Acceleration on servers powered by the new generation of Intel processors, users can quickly and easily transform a torrent of data into valuable, contextualized business insights. Complex queries that once took hours or days to yield insights can now be analyzed as fast as the data is gathered.  See how to capture and capitalize on business intelligence with Intel and IBM.

 

From a performance perspective, Apama* streaming analytics has proven to be equally impressive. Apama (a division of Software AG) is a complex event processing engine that looks at streams of incoming data, then filters, analyzes, and takes automated action on that fast-moving big data. Benchmarking tests have shown huge performance gains with the newest Intel Xeon processors. Test results show 59 percent higher throughput with Apama running on a server powered by the Intel Xeon processor E7 v3 family compared to the previous-generation processor.4

 

Drawing on this level of processing power, the Apama platform can tap the value hidden in streaming data to uncover critical events and trends in real time. Users can take real-time action on customer behaviors, instantly identify unusual behavior or possible fraud, and rapidly detect faulty market trades, among other real-world applications. For more information, watch the video on Driving Big Data Insight from Software AG. This infographic shows Apama performance gains achieved when running its software on the newest Intel Xeon processors.

 

SAS applications provide a unified and scalable platform for predictive modeling, data mining, text analytics, forecasting, and other advanced analytics and business intelligence solutions. Running SAS applications on the latest Xeon processors provides an advanced platform that can help increase performance and headroom, while dramatically reducing infrastructure cost and complexity. It also helps make analytics more approachable for end customers. This video illustrates how the combination of SAS and Intel® technologies delivers the performance and scale to enable self-service tools for analytics, with optimized support for new, transformative applications. Further, by combining SAS* Analytics 9.4 with the Intel Xeon processor E7 v3 family and the Intel® Solid-State Drive Data Center Family for PCIe*, customers can experience throughput gains of up to 72 percent. 5

 

The new Intel Xeon processor E7 v3 family’s ability to drive new levels of application performance also extends to healthcare. To accelerate Epic* EMR’s data-driven healthcare workloads and deliver reliable, affordable performance and scalability for other healthcare applications, the company needed a very robust, high-throughput foundation for data-intensive computing. Epic’s engineers benchmark-tested a new generation of key technologies, including a high-performance data platform from InterSystems*, new virtualization tools from VMware*, and the Intel Xeon processor E7 v3 family. The result was an increase in database scalability of 60 percent,6,7 a level of performance that can keep pace with the rising data access demands in the healthcare enterprise while creating a more reliable, cost-effective, and agile data center. With this kind of performance improvement, healthcare organizations can deliver increasingly sophisticated analytics and turn clinical data into actionable insight to improve treatment plans and, ultimately, patient outcomes.

 

These are only a handful of the optimized software solutions that, when powered by the latest generation of Intel processors, are enabling tremendous business benefits and competitive advantage. With greatly improved performance, memory capacity, and scalability, the Intel Xeon processor E7 v3 family helps deliver more sockets, heightened security, increased data center efficiency, and the critical reliability to handle any workload across a range of industries, so that your data center can bring your business’s best ideas to life. To learn more, visit our software solutions page and take a look at our Enabled Applications Marketing Guide.

 

 

 

 

 

 

1 Intel Xeon processor E7 v3 family provides the largest memory footprint of 1.5 TB per socket compared to up to 1TB per socket delivered by alternative architectures, based on published specs.

2 Up to 6x business processing application performance improvement claim based on SAP* OLTP internal in-memory workload measuring transactions per minute (tpm) on SuSE* Linux* Enterprise Server 11 SP3. Configurations: 1) Baseline 1.0: 4S Intel® Xeon® processor E7-4890 v2, 512 GB memory, SAP HANA* 1 SPS08. 2) Up to 6x more tpm: 4S Intel® Xeon® processor E7-8890 v3, 512 GB memory, SAP HANA* 1 SPS09, which includes 1.8x improvement from general software tuning, 1.5x generational scaling, and an additional boost of 2.2x for enabling Intel TSX.

3 Software and workloads used in the performance test may have been optimized for performance only on Intel® microprocessors. Previous generation baseline configuration: SuSE Linux Enterprise Server 11 SP3 x86-64, IBM DB2* 10.1 + 4-socket Intel® Xeon® processor E7-4870 using IBM Gen3 XIV FC SAN solution completing the queries in about 3.58 hours.  ‘New Generation’ new configuration: Red Hat* Enterprise LINUX 6.5, IBM DB2 10.5 with BLU Acceleration + 4-socket Intel® Xeon® processor E7-8890 v3 using tables in-memory (1 TB total) completing the same queries in about 52.3 seconds.  For more complete information visit http://www.intel.com/performance/datacenter

4 One server was powered by a four-socket Intel® Xeon® processor E7-8890 v3 and another server with a four-socket Intel Xeon processor E7-4890 v2. Each server was configured with 512 GB DDR4 DRAM, Red Hat Enterprise Linux 6.5*, and Apama 5.2*. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

5 Up to 1.72x generational claim based on SAS* Mixed Analytics workload measuring sessions per hour using SAS* Business Analytics 9.4 M2 on Red Hat* Enterprise Linux* 7. Configurations: 1) Baseline: 4S Intel® Xeon® processor E7-4890 v2, 512 GB DDR3-1066 memory, 16x 800 GB Intel® Solid-State Drive Data Center S3700, scoring 0.11 sessions/hour. 2) Up to 1.72x more sessions per hour: 4S Intel® Xeon® processor E7-8890 v3, 512 GB DDR4-1600 memory, 4x 2.0 TB Intel® Solid-State Drive Data Center P3700 + 8x 800 GB Intel® Solid-State Drive Data Center S3700, scoring 0.19 sessions/hour.

6 Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to www.intel.com/performance

7 Intel does not control or audit the design or implementation of third party benchmark data or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmark data are reported and confirm whether the referenced benchmark data are accurate and reflect performance of systems available for purchase.

By David Fair, Unified Networking Marketing Manager, Intel Networking Division

 

Certainly one of the miracles of technology is that Ethernet continues to be a fast-growing technology 40 years after its initial definition. That was May 23, 1973, when Bob Metcalfe wrote his memo to his Xerox PARC managers proposing “Ethernet.” To put things in perspective, 1973 was the year a signed ceasefire ended the Vietnam War. The U.S. Supreme Court issued its Roe v. Wade decision. Pink Floyd released “Dark Side of the Moon.”

 

In New York City, Motorola made the first handheld mobile phone call (and, no, it would not fit in your pocket).  1973 was four years before the first Apple II computer became available, and eight years before the launch of the first IBM PC. In 1973, all consumer music was analog: vinyl LPs and tape.  It would be nine more years before consumer digital audio arrived in the form of the compact disc—which, ironically, has long since been eclipsed by Ethernet packets as the primary way digital audio gets to consumers.

 


 

The key reason for Ethernet’s longevity, IMHO, is its uncanny, Darwinian ability to evolve to adapt to ever-changing technology landscapes.  A tome could be written about the many technological challenges to Ethernet and its evolutionary response, but I want to focus here on just one of these: the emergence of multi-core processors in the first decade of this century.

 

The problem Bob Metcalfe was trying to solve was how to get packets of data from computer to computer and, of course, to Xerox laser printers. But multi-core challenges that paradigm: Ethernet’s job, as Bob defined it, is done when data gets to a computer’s processor, before it reaches the correct core in that processor waiting to consume that data.

 

Intel developed a technology to help address that problem, and we call it Intel® Ethernet Flow Director.  We implemented it in all of Intel’s most current 10GbE and 40GbE controllers. What Intel® Ethernet Flow Director does, in a nutshell, is establish an affinity between a flow of Ethernet traffic and the specific core in a processor waiting to consume that traffic.
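
To illustrate the idea (not the driver interface), the toy model below pins each flow, identified by its 5-tuple, to the queue and core where the consuming application runs, falling back to an ordinary hash otherwise. This is conceptual Python only; the real steering happens in the NIC hardware and driver.

```python
# Conceptual model of flow-to-core affinity; not the actual NIC/driver interface.
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowKey:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str


class FlowAffinityModel:
    def __init__(self) -> None:
        self.table: dict[FlowKey, int] = {}   # flow -> RX queue (paired with a core)

    def learn(self, flow: FlowKey, consuming_core: int) -> None:
        """Record the core on which the application thread consumes this flow."""
        self.table[flow] = consuming_core

    def steer(self, flow: FlowKey, fallback_hash_queue: int) -> int:
        """Return the queue for an incoming packet of this flow."""
        return self.table.get(flow, fallback_hash_queue)
```

The payoff is that packets arrive on the core that will process them, avoiding cross-core cache traffic and rescheduling.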

 

I encourage you to watch a two and a half minute video explanation of how Intel® Ethernet Flow Director works.  If that, as I hope, whets your appetite to learn more about this Intel technology, we also have a white paper that delves into deeper details with an illustration of what Intel® Ethernet Flow Director does for a “network stress test” application like Memcached.  I hope you find both the video and white paper enjoyable and illuminating.

 

Intel, the Intel logo, and Intel Ethernet Flow Director are trademarks of Intel Corporation in the U.S. and/or other countries.

 

*Other names and brands may be claimed as the property of others.

In today’s world, engineering teams can be located just about anywhere in the world, and the engineers themselves can work from just about any location, including home offices. This geographic dispersion creates a dilemma for corporations that need to arm engineers with tools that make them more productive while simultaneously protecting valuable intellectual property—and doing it all in an affordable manner.

 

Those goals are at the heart of hosted workstations that leverage new combinations of technologies from Intel and Citrix*. These solutions, unveiled this week at the Citrix Synergy 2015 show in Orlando, allow engineers to work with demanding 3D graphics applications from virtually anywhere in the world, with all data and applications hosted in a secure data center. Remote users can work from the same data set, with no need for high-volume data transfers, while enjoying the benefits of fast, clear graphics running on a dense, cost-effective infrastructure.

 

These solutions are in the spotlight at Citrix Synergy. Event participants had the opportunity to see demos of remote workstations capitalizing on the capabilities of the Intel® Xeon® processor E3-1200 product family and Citrix XenApp*, XenServer*, XenDesktop*, and HDX 3D Pro* software.

 

Show participants also had a chance to see demos of graphics passthrough with Intel® GVT-d in Citrix XenServer* 6.5, running Autodesk* Inventor*, SOLIDWORKS*, and Autodesk Revit* software. Other highlights included a technology preview of Intel GVT-g with Citrix HDX 3D Pro running Autodesk AutoCAD*, Adobe* Photoshop*, and Google* Earth.

 

Intel GVT-d and Intel GVT-g are two of the variants of Intel® Graphics Virtualization Technology. Intel GVT-d allows direct assignment of an entire GPU’s capabilities to a single user—it passes all of the native driver capabilities through the hypervisor. Intel GVT-g allows multiple concurrent users to share the resources of a single GPU.

 

The new remote workstation solutions showcased at Citrix Synergy build on a long, collaborative relationship between engineers at Intel and Citrix. Our teams have worked together for many years to help our mutual customers deliver a seamless mobile and remote workspace experience to a distributed workforce. Users and enterprises both benefit from the secure and cost-effective delivery of desktops, apps, and data from the data center to the latest Intel Architecture-based endpoints.

 

For a closer look at the Intel Xeon processor E3-1200 product family and hosted workstation infrastructure, visit intel.com/workstation.

 

 

Intel, the Intel logo, Intel inside, and Xeon are trademarks of Intel Corporation in the U.S. and other countries. Citrix, the Citrix logo, XenDesktop, XenApp, XenServer, and HDX are trademarks of Citrix Systems, Inc. and/or one of its subsidiaries, and may be registered in the U.S. and other countries. * Other names and brands may be claimed as the property of others.

The “Intel Ethernet” brand symbolizes the decades of hard work we’ve put into improving performance, features, and ease of use of our Ethernet products.

 

What Intel Ethernet doesn’t stand for, however, is any use of proprietary technology. In fact, Intel has been a driving force for Ethernet standards since we co-authored the original specification more than 40 years ago.

 

At Interop Las Vegas last week, we again demonstrated our commitment to open standards by taking part in the NBASE-T Alliance public multi-vendor interoperability demonstration. The demo leveraged our next generation single-chip 10GBASE-T controller supporting the NBASE-T intermediate speeds of 2.5Gbps and 5Gbps (see a video of that demonstration here).


 

Intel joined the NBASE-T Alliance in December 2014 at the highest level of membership, which allows us to fully participate in the technology development process including sitting on the board and voting for changes in the specification.

 

The alliance, and its 33 members, is an industry-driven consortium that has developed a working 2.5GbE / 5GbE specification that is the basis of multiple recent product announcements. Based on this experience, our engineers are working diligently now to develop the IEEE standard for 2.5G/5GBASE-T.

 

By first developing the technology in an industry alliance, vendors can have a working specification to develop products, and customers can be assured of interoperability.

 

The reason Ethernet has been so widely adopted over the past 40 years is its ability to adapt to new usage models. 10GBASE-T was originally defined to be backward compatible with 1GbE and 100Mbps, and required Category 6a or Category 7 cabling to reach 10GbE. Adoption of 10GBASE-T is growing very rapidly in the datacenter, and now we are seeing the need for more bandwidth in enterprise and campus networks to support next-generation 802.11ac access points, local servers, workstations, and high-end PCs.

 

Copper twisted pair has long been the cabling preference for enterprise data centers and campus networks, and most enterprises have miles and miles of this cable already installed throughout their buildings. In the past 10 years alone, about 70 billion meters of category 5e and category 6 cabling have been sold worldwide.


Supporting higher bandwidth connections over this installed cabling is a huge win for our customers. Industry alliances can be a useful tool to help Ethernet adapt, and the NBASE-T alliance enables the industry to address the need for higher bandwidth connections over installed cables.


Intel is the technology and market leader in 10GBASE-T network connectivity. I spoke about Intel's investment in the technology in an earlier blog about Ethernet’s ubiquity.

 

We are seeing rapid adoption of our 10GBASE-T products in the data center, and now through the NBASE-T Alliance we have a clear path to address enterprise customers who need more than 1GbE. Customers are thrilled to hear that they can get 2.5GbE/5GbE over their installed Cat 5e copper cabling—making higher-speed networking between bandwidth-constrained endpoints achievable.

 

Ethernet is a rare technology in that it is both mature (more than 40 years old since its original definition in 1973) and constantly evolving to meet new network demands. Thus, it has created an expectation by users that the products will work the first time, even if they are based on brand new specifications. Our focus with Intel Ethernet products is to ensure that we implement solutions that are based on open standards and that these products seamlessly interoperate with products from the rest of the industry.

 

If you missed the NBASE-T demonstration at Interop, come see how it works at Cisco Live in June in San Diego.


When I started my career in IT, infrastructure provisioning involved a lot of manual labor. I installed the hardware, installed the operating systems, connected the terminals, and loaded the software and data, to create a single stack to support a specific application. It was common to have one person who carried out all of these tasks on a single system with very few systems in an Enterprise.

 

Now let’s fast forward to the present. In today’s world, thanks to the dynamics of Moore’s Law and the falling cost of compute, storage, and networking, enterprises now have hundreds of applications that support the business. Infrastructure and applications are typically provisioned by teams of domain specialists—networking admins, system admins, storage admins, and software folks—each of whom puts together a few pieces of a complex technology puzzle to enable the business.

 

While it works, this approach to infrastructure provisioning has some obvious drawbacks. For starters, it’s labor-intensive, with too many hands required for support; it’s costly in both people and software; and it can be rather slow from start to finish. While the first two matter for TCO, it is the third that I have heard the most about… just too slow for the pace of business in the era of fast-moving cloud services.

 

How do you solve this problem? That is what Software Defined Infrastructure is all about. With SDI, compute, network, and storage resources are deployed as services, potentially reducing deployment times from weeks to minutes. Once services are up and running, hardware is managed as a set of resources, and software has the intelligence to manage the hardware to the advantage of the supported workloads. The SDI environment automatically corrects issues and optimizes performance to ensure you can meet the service levels and security controls that your business demands.

 

So how do you get to SDI? My current response is that SDI is a destination that sits at the summit for most organizations. At the simplest level, there are two routes to this IT nirvana—a “buy it” high road and a “build-it-yourself” low road. I call the former a high road because it’s the easiest way forward—it’s always easier to go downhill than uphill. The low road has lots of curves and uphill stretches on it to bring you to the higher plateau of SDI.  Each of these approaches has its advantages and disadvantages.

 

The high road, or the buy-the-packaged-solution route, is defined by system architectures that bring together all the components for an SDI into a single deployable unit. Service providers who take you on the high road leverage products like Microsoft Cloud Platform System (CPS) and VMware EVO: RAIL to create standalone platform units with virtualized compute, storage, and networking resources.

 

On the plus side, the high road offers faster time to market for your SDI environment, a tested and certified solution, and the 24x7 support most enterprises are looking for. These are the things you can expect in a solution delivered by a single vendor. On the downside, the high road locks you into certain choices in the hardware and software components and forces you to rely on the vendor for system upgrades and technology enhancements, which might happen faster with other solutions but take place on the vendor’s timeline. This approach, of course, can be both Opex and Capex heavy, depending on the solution.

 

The low road, or the build-it-yourself route, gives you the flexibility to design your environment and select your solution components from the portfolios of various hardware and software vendors and from open source. You gain the agility and technology choices that come with an environment that is not defined by a single vendor. You can pick your own components and add new technologies on your timelines—not your vendor’s—and probably enjoy lower Capex along the way, although at the expense of more internal technical resources.

 

Those advantages, of course, come with a price. The low road can be a slower route to SDI, and it can be a drain on your staff resources as you engage in all the heavy lifting that comes with a self-engineered solution set. Also, given the pace of innovation you see today in this area, it is quite possible that you never fully achieve the vision of SDI because of all the new choices. You have to design your solution; procure, install, and configure the hardware and software; and add the platform-as-a-service (PaaS) layer. All of that just gets you to a place where you can start using the environment. You still haven’t optimized the system for your targeted workloads.

 

In practice, most enterprises will take what amounts to a middle road. This hybrid route takes the high road to SDI with various detours onto the low road to meet specific business requirements. For example, an organization might adopt key parts of a packaged solution but then add its own storage or networking components or decide to use containers to implement code faster.

 

Similarly, most organizations will get to SDI in a stepwise manner. That is to say, they will put elements of SDI in place over time—such as storage and network virtualization and IT automation—to gain some of the agility that comes with an SDI strategy. I will look at these concepts in an upcoming post that explores an SDI maturity model.

Management practices from the HPC world can get even bigger results in smaller-scale operations.

 

In 2014, industry watchers saw a major rise in hyperscale computing. Hadoop and other cluster architectures that originated in academic and research circles have become almost commonplace in the industry. Big data and business analytics are driving huge demand for computing power, and 2015 should be another big year in the datacenter world.

 

What would you do if you had the same operating budget as one of the hyperscale data centers? It might sound like winning the lottery, or entering a world without limitations, but any datacenter manager knows that infrastructure scaling requires tackling even bigger technology challenges -- which is why it makes sense to watch and learn from the pioneers who are pushing the limits.

 

Lesson 1: Don't lose sight of the "little" data

 

When the datacenter scales up, most IT teams look for a management console that can provide an intuitive, holistic view that simplifies common administrative tasks. When managing the largest-scale datacenters, the IT teams have also learned to look for a console that taps into the fine-grained data made available by today's datacenter platforms. This includes real-time power usage and temperature for every server, rack, row, or room full of computing equipment.

 

Management consoles that integrate energy management middleware can aggregate these datacenter data points into at-a-glance thermal and power maps, and log all of the data for trend analysis and capacity planning. The data can be leveraged for a variety of cost-cutting practices. For example, datacenter teams can more efficiently provision racks based on actual power consumption. Without an understanding of real-time patterns, datacenter teams must rely on power supply ratings and static lab measurements.

 

A sample use case illustrates the significant differences between real-time monitoring and static calculations. When provisioning a rack with 4,000 watts capacity, traditional calculations resulted in one datacenter team installing approximately 10 servers per rack. (In this example, the server power supplies are rated at 650 watts, and lab testing has shown that 400 watts is a safe bet for expected configurations.)

 

The same team carried out real-time monitoring of power consumption, and found that servers rarely exceeded 250 watts. This knowledge led them to increase rack provisioning to 16 servers -- a 60% increase in capacity. To prevent damage in the event that servers in any particular rack create demand that would push the total power above the rack threshold, the datacenter team simultaneously introduced protective power capping for each rack, which is explained in more detail in Lesson 5 below.
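
The arithmetic behind that example, worked in a few lines (the 4,000-watt budget and per-server figures come from the scenario above):

```python
# Rack provisioning: static estimate vs. measured real-time draw.
RACK_BUDGET_W = 4000


def servers_per_rack(per_server_w: float) -> int:
    return int(RACK_BUDGET_W // per_server_w)


static_estimate = servers_per_rack(400)   # lab "safe bet" per server
measured = servers_per_rack(250)          # observed real-time draw

print(static_estimate)                    # 10 servers
print(measured)                           # 16 servers
print(f"{(measured - static_estimate) / static_estimate:.0%} more capacity")  # 60%
```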

 

Lesson 2: Get rid of your ghosts

 

Once a datacenter team is equipped to monitor real-time power consumption, it becomes a simple exercise to evaluate workload distribution across the datacenter. Servers and racks that are routinely under-utilized can be easily spotted. Over time, datacenter managers can determine which servers can be consolidated or eliminated. Ghost servers, the systems that are powered up but idle, can be put into power-conserving sleep modes. These and other energy-conserving steps can be taken to avoid energy waste and therefore trim the utility budget. Real-world cases have shown that the average datacenter, regardless of size, can trim 15 to 20 percent by tackling ghost servers.

 

Lesson 3: Choose software over hardware

 

Hyperscale operations often span multiple geographically distributed datacenters, making remote management vital for day-to-day continuity of services. The current global economy has put many businesses and organizations into the same situation, with IT trying to efficiently manage multiple sites without duplicating staff or wasting time traveling between locations.

 

Remote keyboard, video, and mouse (KVM) technology has evolved over the past decades, helping IT teams keep up, but hardware KVM solutions have become increasingly complex as a result. To avoid having to manage the management overlay itself, the operators of many of the world's largest and most complex infrastructures are adopting software KVM solutions and, more recently, virtualized KVM solutions.

 

Even for the average datacenter, the cost savings add up quickly. IT teams should add up the costs of any existing KVM switches, dongles, and related licensing costs (switch software, in-band and out-of-band licenses, etc.). A typical hardware KVM switching solution can cost more than $500K for the switch, $125K for switch software, and another $500K for in-band and out-of-band node licenses. Even the dongles can add up to more than $250K. Alternatively, software KVM solutions can avoid more than $1M in hardware KVM costs.

 

Lesson 4: Turn up the heat

 

With many years of experience monitoring and managing energy and thermal patterns, some of the largest datacenters in the world have pioneered high ambient temperature operation. Published numbers show that raising the ambient temperature in the datacenter by 1°C results in a 2% decrease in the site power bill.
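
A quick worked example of that rule of thumb; the starting temperature, increase, and utility bill are illustrative numbers.

```python
# Rough savings estimate from raising ambient temperature, using the
# ~2%-per-degree-C rule of thumb cited above. Figures are illustrative.
ANNUAL_POWER_BILL_USD = 1_000_000
SAVINGS_PER_DEGREE_C = 0.02


def estimated_savings(degrees_raised: float) -> float:
    return ANNUAL_POWER_BILL_USD * SAVINGS_PER_DEGREE_C * degrees_raised


# Raising the set point from 22°C to 27°C:
print(f"${estimated_savings(27 - 22):,.0f} per year")   # ~$100,000
```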

 


 

It is important to regularly check for hot spots and monitor datacenter devices in real time for temperature-related issues when raising ambient temperature of a datacenter. With effective monitoring, the operating temperature can be adjusted gradually and the savings evaluated against the budget and capacity plans.

 

Lesson 5: Don't fry your racks

 

Since IT is expected -- mandated -- to identify and avoid failures that would otherwise disrupt critical business operations, any proactive management techniques that have been proven in hyperscale datacenters should be evaluated for potential application in smaller datacenters. High operating temperatures can have a devastating effect on hardware, and it is important to keep a close eye on how this can impact equipment uptime and life cycles.

 

Many distributed computing frameworks, such as Hadoop, build in redundancy and dynamic load balancing to recover seamlessly from failures. The same foundational monitoring, alerts, and automated controls that help minimize hyperscale energy requirements can help smaller sites identify and eliminate hot spots that have a long-term impact on equipment health. This holistic approach to power and temperature also helps maintain a more consistent environment in the datacenter, which ultimately avoids equipment-damaging temperatures and power spikes.

 

Beyond environmental controls, IT teams can also take advantage of leading-edge energy management solutions that offer power-capping capabilities. By setting power thresholds, racks can be provisioned more aggressively without the risk of power spikes. In some regions, power capping is also crucial for protecting datacenters from noisy, unreliable power sources.
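The sketch below shows the shape of a rack-level protective cap, in the spirit of the 4,000-watt rack example earlier in this post: when measured draw approaches the provisioned threshold, every node is asked to cap itself. The safety margin, host names, and set_node_power_cap() hook are hypothetical stand-ins for a node-manager or BMC call.

```python
# Simplified rack-level protective power capping. The threshold matches the
# earlier 4,000 W example; the margin and the capping hook are illustrative.

RACK_THRESHOLD_W = 4000.0
SAFETY_MARGIN = 0.95  # start capping at 95% of the rack budget

def set_node_power_cap(host: str, cap_w: float) -> None:
    # Placeholder for a real node-manager / BMC power-capping call.
    print(f"capping {host} at {cap_w:.0f} W")

def enforce_rack_cap(node_watts: dict[str, float]) -> None:
    """If the rack nears its budget, split the budget evenly across nodes."""
    total = sum(node_watts.values())
    if total < RACK_THRESHOLD_W * SAFETY_MARGIN:
        return  # within budget; no action needed
    per_node_cap = RACK_THRESHOLD_W * SAFETY_MARGIN / len(node_watts)
    for host in node_watts:
        set_node_power_cap(host, per_node_cap)

# Example: 16 servers drawing 250 W each hit the 95% trip point and get capped.
enforce_rack_cap({f"node{i:02d}": 250.0 for i in range(1, 17)})
```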

 

Following the leaders

 

Thankfully, most datacenters operate at a scale that carries far lower risk than the largest hyperscale computing environments. Even so, datacenters of any size should make it a priority to reduce energy costs and avoid service disruptions. By adopting proven approaches and taking advantage of real-time data from across the datacenter, IT and facilities teams can follow the lead of hyperscale sites and get big results from relatively small initial investments.

By David Fair, Unified Networking Marketing Manager, Intel Networking Division

 

iWARP was on display recently in multiple contexts.  If you’re not familiar with iWARP, it is an enhancement to Ethernet based on an Internet Engineering Task Force (IETF) standard that delivers Remote Direct Memory Access (RDMA).

 

In a nutshell, RDMA allows an application to read or write a block of data from or to the memory space of another application, which can be in another virtual machine or even a server on the other side of the planet.  It delivers high bandwidth and low latency by bypassing the operating system kernel, avoiding the interrupts and extra data copies that accompany kernel processing.

 

A secondary benefit of kernel bypass is reduced CPU utilization, which is particularly important in cloud deployments. More information about iWARP has recently been posted to Intel’s website if you’d like to dig deeper.

 

Intel® is planning to incorporate iWARP technology in future server chipsets and systems-on-a-chip (SoCs).  To emphasize our commitment and show how far along we are, Intel showed a demo using the RTL from that future chipset in FPGAs, running Windows* Server 2012 SMB Direct and performing a boot and virtual machine migration over iWARP.  Naturally it was slow -- about 1 Gbps -- since it was FPGA-based, but the demo showed that our iWARP design is already far along and robust.  (That’s Julie Cummings, the engineer who built the demo, in the photo with me.)

 

[Photo: Julie Cummings and the author with the iWARP FPGA demo]

 

Jim Pinkerton, Windows Server Architect, from Microsoft joined me in a poster chat on iWARP and Microsoft’s SMB Direct technology, which scans the network for RDMA-capable resources and uses RDMA pathways to automatically accelerate SMB-aware applications.  With SMB Direct, no new software and no system configuration changes are required for system administrators to take advantage of iWARP.

 


 

Jim Pinkerton also co-taught the “Virtualizing the Network to Enable a Software Defined Infrastructure” session with Brian Johnson of Intel’s Networking Division.  Jim presented specific iWARP performance results in that session that Microsoft has measured with SMB Direct.

 

Lastly, the Non-Volatile Memory Express* (NVMe*) community demonstrated “remote NVMe,” made possible by iWARP.  NVMe is a specification for efficient communication with non-volatile memory, such as flash, over PCI Express.  NVMe is many times faster than SATA or SAS but, like those technologies, targets local communication with storage devices.  iWARP makes it possible to access NVM securely and efficiently across an Ethernet network.  The demo showed remote access achieving the same throughput as local access (~550K IOPS), with a latency penalty of less than 10 µs.**

 


 

Intel is supporting iWARP because it is layered on top of the TCP/IP industry standards.  iWARP goes anywhere the Internet goes and does it with all the benefits of TCP/IP, including reliable delivery and congestion management. iWARP works with all existing switches and routers and requires no special datacenter configurations to work. Intel believes the future is bright for iWARP.

 

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.

 

*Other names and brands may be claimed as the property of others.

**Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.

May 5th, 2015 was an exciting day for Big Data analytics. Intel hosted an event focused on data analytics, announcing the next generation of the Intel® Xeon® Processor E7 family and sharing an update on Cloudera one year after investing in the company.

 


At the event, I had the pleasure of hosting a panel discussion among three very interesting data science experts:

 

  • David Edwards, VP and Engineering Fellow at Cerner, a healthcare IT and electronic medical records company, has overseen the development of a Cloudera-based Big Data analytics system for patient medical data. That system has enabled a number of highly effective predictive models that have already saved the lives of hundreds of patients.

 

  • Don Fraynd, CEO of TeacherMatch, leads an analytics company that has developed models correlating a broad variety of teacher attributes with actual student performance measures to make teacher hiring more effective. These models are used to identify the most promising candidates for each teaching position, given the individual circumstances of the opportunity.

 

  • Andreas Weigend, Director of the Social Data Lab, professor at Stanford and UC Berkeley, and past Chief Scientist at Amazon, has been a leader in data science since before data science was a “thing.” His insights into measuring customer behavior and predicting how customers make decisions have changed the way we experience the Internet.

 

My guests have all distinguished themselves by creating analytics solutions that provide actionable insights into individual human behavior in education, healthcare, and retail.  Over the course of the discussion, a major theme emerged: data analytics must empower individuals to take action in real time.

 

David described how Cerner’s algorithms analyze a variety of patient monitoring data in the hospital to identify patients who are going into septic shock, a life-threatening toxic reaction to infection. “If you don’t close that loop and provide that immediate feedback in real time, it’s very difficult to change the outcome.”

 

Don explained how TeacherMatch is “using hot data, dashboards, and performance management practices in our schools to effect decisions in real time…What are the precursors to a student failing a course? What are the precursors to a student having a major trauma event?”

 

Andreas advanced the concept of a dashboard one step further and postulated that a solution analogous to a navigation system is what’s needed, because it can improve the quality of the data over time. “Instead of building complicated models, build incentives so that people share with you…I call this a data refinery…that takes data of the people, data by the people and makes it data to be useful for the people.”

 

Clearly, impactful analytics are as much about timeliness and responsiveness as they are about data volume and variety, and they drive actions, not just insights.

 

In his final comments, David articulated one of my own goals for data science: “To make Big Data boring and uninteresting.” In other words, our goal is to make it commonplace for companies to utilize all of their data, both structured and unstructured, to provide better customer experiences, superior student performance or improved patient outcomes. As a data scientist, I can think of no better outcome for the work I do every day.

 

Thanks to our panelists and the audience for making this an engaging and informative event. Check out the full panel to get all of the great insights.
