Ptak Associates Tech Blog: Big Data/Analytics Performance

Tuesday, September 29, 2015

Big Data/Analytics Performance – Driving IBM Power System Success

By Rich Ptak

It’s no secret that the action today is in Big Data and the associated Analytics! Whether for business, retail, education, government, medical, media, whatever, the focus is on data! Lots of it! It comes from every direction, from every conceivable source and in every format. It is structured, unstructured, transactional, audio and visual, flowing from IoT, mobile, social, production… to the tune of some 2.5 quintillion new bytes generated every single day.

IT is tasked with processing this raw data into the insights, wisdom and knowledge that result in new services or deliver solutions to previously impenetrable mysteries. The ultimate goal is to deliver benefits and provide value to users, clients and customers. Processing large amounts of data has been computing’s forte since its inception. BUT now processing data and generating results is immensely more complicated, and the results must be delivered more quickly and economically than ever before.

IBM’s POWER8 was specifically designed as a Big Data server with industry-leading memory bandwidth, thread density and cache architecture. It has the analytics tools[1], operating systems[2] and databases[3] to be the System of Insight, equipped to deal with the software, performance and management challenges of Big Data analysis, integration and governance.

And, in discussions with users, we’ve seen that it delivers. See our blogs about customer[4] success at dealing with Big Data challenges using POWER8 systems. The examples range from near real-time response (Algo-Logic’s 1.5-microsecond Tick-to-Trade), to significant cost savings with improved performance (IBM Platinum Partner Redis Labs processes more REDIS-NoSQL transactions, with faster response times, on fewer CAPI-POWER8 servers), to TalkTalk [5], a UK communications service provider, updating its network and improving service to its customers by switching to Power-CAPI powered servers.

No industry-standard benchmark existed for Apache Spark [6] until IBM developed the SparkBench benchmark suite. The first version includes 10 benchmarks covering four use cases: Machine Learning, Graph, SQL and Streaming. The results show a wide variety of Spark workloads consistently running 2x faster on POWER8 than on competitor platforms. (FACT: POWER8 with 24 processor cores runs 37% faster than Haswell with 36 processor cores.) You can get SparkBench details and results here [7]. And, if you want to make sure that SparkBench is the REAL thing, it is available to the public here [8].

IBM recently announced LinuxONE [9] for the mainframe world. We expect more interesting information on new capabilities and products in the October 5th webcast. We’ve registered and suggest that you do so also at: http://tinyurl.com/nctlofd.
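A note on the arithmetic: the 24-core versus 36-core figure quoted above can be normalized per core. The short Python sketch below is our own illustration and not part of IBM's report. It assumes that "37% faster" means a whole-system throughput ratio of 1.37, and the function name `per_core_speedup` is purely hypothetical.

```python
def per_core_speedup(system_speedup: float, cores_a: int, cores_b: int) -> float:
    """Per-core throughput ratio of system A over system B.

    system_speedup: A's whole-system throughput divided by B's
                    (assumption: "37% faster" is read as 1.37).
    cores_a, cores_b: core counts of systems A and B.
    """
    if cores_a <= 0 or cores_b <= 0:
        raise ValueError("core counts must be positive")
    # Throughput per core is system throughput / cores, so the ratio is
    # (T_a / cores_a) / (T_b / cores_b) = system_speedup * cores_b / cores_a.
    return system_speedup * cores_b / cores_a


# Figures quoted in the post: POWER8 (24 cores) vs. Haswell (36 cores).
ratio = per_core_speedup(1.37, cores_a=24, cores_b=36)
print(f"Per-core throughput ratio: {ratio:.3f}x")
```

Under that reading, the quoted figure works out to slightly more than 2x per core. This is only a normalization of one quoted number; it says nothing about other workloads or configurations.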

[1] Hadoop, BigInsights, DB2 BLU and Spark

[2] Red Hat, SUSE and Ubuntu

[3] Oracle, DB2 LUW, MariaDB, MongoDB, PostgreSQL

[4] http://ptakassociates.blogspot.com/

[5] https://www.youtube.com/watch?v=nQEgXxYCobI

[6] An in-memory distributed compute engine to complete analysis on large-scale data sets up to 100X faster than current technologies. More info on Apache Spark here: http://tinyurl.com/nta8zvz

[7] http://tinyurl.com/nfbb2jo

[8] https://github.com/SparkTC/spark-bench

[9] http://ptakassociates.blogspot.com/

12 comments:

Unknown, September 30, 2015 at 11:15 AM
Actually, on SparkBench, IBM's testing crippled the x86 servers down to 24 cores to match IBM's poor S822L, which has a maximum of only 24 cores. Please read the reports you refer to!!!

This artificially limits x86 performance. IBM also has to play with SMT2 and SMT4 to turn off threads. Evidently, too, the Power8 design has contention that forces CUSTOMERS to know hardware details.
Ptak Associates, November 25, 2015 at 11:55 AM
We took your comment seriously. Since we stand behind our work, we decided to investigate it, with some help from IBM’s Randy Swanberg, the author of the IBM blog documenting the Spark benchmark that IBM ran. You can read IBM’s report on the benchmark at: https://www.ibm.com/developerworks/community/blogs/f0f3cd83-63c2-4744-9021-9ff31e7004a9/entry/Apache_Spark_Runs_2X_Faster_on_IBM_s_POWER8?lang=en.

We found no evidence of crippling. IBM compared a Power Systems S822L and an HP DL380. Both systems had a maximum of 24 cores. We think that this is a fair comparison. Software is frequently priced by the number of cores in the system. Clearly, many software vendors think that the number of cores is a valid system characterization. Of course, there are other Intel systems that have more cores, but that could also be said of the Power systems. Both systems had essentially the same hardware and software configuration. The details are in the referenced IBM blog.

POWER8 has 8 SMT (Simultaneous Multithreading) threads per core. The Intel architecture has only 2 SMT threads per core. This means that applications able to take advantage of POWER8’s 8 threads will get a performance advantage. We see nothing wrong with a vendor using standard features in its architecture when benchmarking. If that gives it a performance advantage over a competitor, then so be it. There is nothing underhanded in that. Of course, not every application can take advantage of the 8 threads, so turning off threads (via SMT2 or SMT4) is a legitimate tuning option. IBM carefully explains in its blog exactly what it did, and states that where there were possibilities to tune the x86 system, IBM used them.

We conclude that your criticism of IBM’s benchmark is unwarranted. We stand behind our work. Our investigation here clearly shows that IBM did nothing to “cripple” the Intel system. Our blog is accurate. The subject benchmark is a fair comparison of the two companies’ technology as measured by the Spark benchmark. Obviously, new Intel and IBM systems will be announced; the benchmark will need to be repeated to determine how the new systems might relate. Also, this benchmark result should not be generalized to other workloads.
