IEEE CloudCom 2011 Trends and hot topics – Part 1

Data architecture

December 1, 2011

This week I attended the 3rd IEEE International Conference on Cloud Computing Technology and Science to present our paper about the Elastic Queuing Service. This gives me a new opportunity to share an overview of the hot topics and trends from the conference. As usual, I will only cover the talks I had the chance to attend, since there were a lot of parallel tracks.

MapReduce in the cloud

This is clearly a hot topic in the community: there was a dedicated MapReduce track, and many more papers were spread across the other tracks and workshops. Those papers mainly focused on improvements to MapReduce (MR):

Better fault tolerance and high availability
New schedulers, especially ones trying to optimize data-local task assignment
Attempts to make MR asynchronous and a little less batch-oriented than it is today

I have to mention a really interesting paper about the convergence between MR and stream programming [1]. The idea is to extend the reduce phase into a sliding window of reduces, in which the reducers can still receive inputs. As a result, you can integrate the MR process into a true streaming approach. I found it really cool because (1) it can be used in a new generation of data warehouses, in which you push data into the processing stream instead of extracting it as is done today, and (2) you can even think about implementing a kind of MR-based CEP (complex event processing), by letting MR and a streaming platform implement your event storage and even your correlation software.
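To make the sliding-window reduce idea more concrete, here is a minimal Python sketch. It is not the implementation described in [1]: the class name `WindowedReducer`, the count-based window and the sum aggregation are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


class WindowedReducer:
    """Toy reducer that keeps accepting inputs and reduces over a sliding window.

    Hypothetical sketch: the count-based window and the sum aggregation are
    assumptions, not the design described in [1].
    """

    def __init__(self, window_size):
        self.window_size = window_size
        # One bounded window of recent values per key.
        self.windows = defaultdict(lambda: deque(maxlen=window_size))

    def push(self, key, value):
        """Receive one mapped (key, value) pair and return the updated result."""
        window = self.windows[key]
        window.append(value)  # the oldest value is dropped once the window is full
        return key, sum(window)


def map_event(event):
    """Map step: turn a raw event into a (key, value) pair."""
    sensor, reading = event
    return sensor, reading


reducer = WindowedReducer(window_size=3)
stream = [("a", 1), ("a", 2), ("b", 10), ("a", 3), ("a", 4)]

# Unlike a batch reduce, a result is emitted after every incoming pair.
results = [reducer.push(*map_event(event)) for event in stream]
print(results)
```

The point of the sketch is that the reducer never "closes": each new pair updates the result for its key, which is what lets the reduce side sit inside a continuous stream.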

We even had a complete tutorial session by a Yahoo! architect about Hadoop HDFS, Pig Latin [2] and Oozie [3]. For those of you who think that Pig Latin is an Italian pig and Oozie a former rock star doing reality shows: Pig Latin is the procedural data processing language of Pig, developed on top of MR. The idea is that you should be able to express data flow processing as a set of SQL-like statements. Behind the scenes, Pig generates a logical and a physical execution plan, and finally the set of MR jobs that must be scheduled.
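As a rough illustration of what "compiling a statement down to MR" means, the Python sketch below hand-writes the map, shuffle and reduce steps that a group-and-count data flow conceptually turns into. It does not reproduce Pig's actual planner; the function names and the sample data are assumptions.

```python
from collections import defaultdict

# Roughly analogous Pig Latin data flow (for orientation only):
#   grouped = GROUP visits BY url;
#   counts  = FOREACH grouped GENERATE group, COUNT(visits);


def map_phase(records):
    """Emit (url, 1) for every visit record."""
    for _user, url in records:
        yield url, 1


def shuffle(pairs):
    """Group all emitted values by key, as the MR framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values of each group to get a count per url."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical sample data.
visits = [("amy", "/home"), ("bob", "/home"), ("amy", "/cart")]
counts = reduce_phase(shuffle(map_phase(visits)))
print(counts)
```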

Oozie is a workflow scheduler for Hadoop. The idea is that you can express a real workflow of operations (a directed acyclic graph of actions) made of MR operations, but also Pig operations. It is a way to orchestrate MR jobs. A nice usage example is described in a Cloudera blog post [4].
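The sketch below shows, in plain Python, what running a directed acyclic graph of actions boils down to: each action runs only after the actions it depends on. This is not Oozie's API or its workflow definition format; the action names and the `topological_order` helper are hypothetical.

```python
from collections import deque


def topological_order(dag):
    """Return the actions in an order that respects dependencies (Kahn's algorithm).

    `dag` maps each action name to the set of actions it depends on.
    """
    remaining = {node: set(deps) for node, deps in dag.items()}
    ready = deque(sorted(node for node, deps in remaining.items() if not deps))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        # Release every action whose last pending dependency was `node`.
        for other in sorted(remaining):
            deps = remaining[other]
            if node in deps:
                deps.remove(node)
                if not deps:
                    ready.append(other)
    if len(order) != len(dag):
        raise ValueError("workflow contains a cycle, so it is not a DAG")
    return order


# Hypothetical workflow mixing MR and Pig actions.
workflow = {
    "import_raw": set(),
    "clean_with_pig": {"import_raw"},
    "aggregate_mr": {"clean_with_pig"},
    "index_mr": {"clean_with_pig"},
    "publish": {"aggregate_mr", "index_mr"},
}

order = topological_order(workflow)
for action in order:
    print("running", action)
```

In this example `aggregate_mr` and `index_mr` do not depend on each other, so a real scheduler would be free to run them in parallel; the sketch simply runs them one after the other.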

I have to say that I was a little disappointed by that track. Every talk started with “MR is increasingly popular, so we have to work with …”, but nobody spoke about the limitations of the paradigm or the complexity of dealing with anything other than map and reduce.

Cloud Architecture

There were a lot of papers about new cloud architectures for interoperability, PaaS container scalability, and privacy and security management. Interoperability is clearly a hot topic: it was addressed through resource management description languages, but also in terms of contextualization [5]. The idea behind this concept is to model the context in which the service (and its VMs) is executed. This can include the IP addresses of the nodes composing the service, the administrative domain in which you are running and all its associated policies, the type of VM you are using, the VPN topology configuration, the data location and data management, etc. The contextualization runtime should then be able to configure your service through its extension points, to make the service compliant with its context.
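Here is one way to picture contextualization in code: a context object describes the environment, and the runtime pushes it into the service through extension points. This is only a sketch under my own assumptions, not the solution proposed in [5]; `Context`, `Service`, `contextualize` and the two hooks are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Hypothetical model of the environment a service runs in."""
    node_ips: list
    admin_domain: str
    vm_type: str
    data_location: str


@dataclass
class Service:
    name: str
    config: dict = field(default_factory=dict)
    # Extension points: callables that adapt the config to a context.
    extension_points: list = field(default_factory=list)


def configure_peers(config, ctx):
    """Extension point: tell the service which nodes it is composed of."""
    config["peers"] = list(ctx.node_ips)


def configure_storage(config, ctx):
    """Extension point: keep data in the location required by the context."""
    config["storage_region"] = ctx.data_location


def contextualize(service, ctx):
    """Toy contextualization runtime: apply every extension point in order."""
    for hook in service.extension_points:
        hook(service.config, ctx)
    return service.config


ctx = Context(
    node_ips=["10.0.0.1", "10.0.0.2"],
    admin_domain="example-domain",
    vm_type="small",
    data_location="eu",
)
service = Service("queue", extension_points=[configure_peers, configure_storage])
config = contextualize(service, ctx)
print(config)
```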

It is worth noting that there were a few papers describing architectures optimized for resource energy management and SLA management. Although SLA management is still a great challenge, there was not much about it. Most of those papers came from former researchers of the reference project in this area, SLA@SOI [6]. The basic idea is that you define your SLA templates (business SLA, service SLA, infrastructure SLA), which you can negotiate with the infrastructure operator. The negotiation is carried out by analysing the available resources and comparing the price and the requested SLA. Once the negotiation phase is finished, the cloud operator must define the monitoring probes needed to monitor the SLA, and define the SLA enforcement logic.
However, in a private cloud you are limited in terms of resources, so how do you deal with negotiated SLAs? There was an interesting piece of research [7] about merging the results of SLA@SOI with advance resource reservation [8]. I strongly recommend that you take a look.
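To see why advance reservation matters when resources are limited, consider this minimal Python sketch of an admission check: a request is accepted only if the capacity holds for every hour of the requested window. It is not the algorithm from [7] or [8]; the `PrivateCloud` and `Reservation` names, the hourly granularity and the CPU-only capacity model are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    start: int  # first hour of the window (inclusive)
    end: int    # last hour of the window (exclusive)
    cpus: int   # CPUs requested for the whole window


class PrivateCloud:
    """Toy operator with a fixed CPU capacity (hypothetical model)."""

    def __init__(self, total_cpus):
        self.total_cpus = total_cpus
        self.accepted = []

    def used_at(self, hour):
        """CPUs already promised to accepted reservations at a given hour."""
        return sum(r.cpus for r in self.accepted if r.start <= hour < r.end)

    def negotiate(self, request):
        """Accept the request only if capacity holds for every hour of its window."""
        for hour in range(request.start, request.end):
            if self.used_at(hour) + request.cpus > self.total_cpus:
                return False
        self.accepted.append(request)
        return True


cloud = PrivateCloud(total_cpus=8)
first = cloud.negotiate(Reservation(start=0, end=4, cpus=6))   # fits in an empty cloud
second = cloud.negotiate(Reservation(start=2, end=6, cpus=4))  # overlaps hours 2-3: 6 + 4 > 8
third = cloud.negotiate(Reservation(start=4, end=8, cpus=4))   # starts once the first one ends
print(first, second, third)
```

A real negotiation would also weigh price and propose alternatives (a later window, fewer CPUs) instead of simply rejecting; the sketch only shows the capacity check.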

That is already a lot for this post, so I will leave my thoughts on cloud standardisation challenges for the next post.
To be continued …

References

[1] Andrey Brito et al., Scalable and Low-Latency Data Processing with Stream MapReduce. In Proceedings of the 3rd IEEE International Conference on Cloud Computing Technology and Science, IEEE CloudCom 2011. IEEE Press, November 2011.

[2] Hadoop PigLatin project, http://pig.apache.org/

[3] Apache Oozie project, http://incubator.apache.org/oozie/

[4] Tim Robertson, Lars Francke and Oliver Meyn. Biodiversity Indexing: Migration from MySQL to Hadoop, June 2011, http://www.cloudera.com/blog/2011/06/biodiversity-indexing-migration-from-mysql-to-hadoop/

[5] Django Armstrong et al., Towards a Contextualization Solution for Cloud Platform Services. In Proceedings of the 3rd IEEE International Conference on Cloud Computing Technology and Science, IEEE CloudCom 2011. IEEE Press, November 2011.

[6] SLA@SOI FP7 Project, http://sla-at-soi.eu/

[7] Kuan Lu et al., QoS-aware SLA-based Advanced Reservation of Infrastructure as a Service. In Proceedings of the 3rd IEEE International Conference on Cloud Computing Technology and Science, IEEE CloudCom 2011. IEEE Press, November 2011.

[8] B. Sotomayor, R. Santiago Montero, I. Martín Llorente, I. Foster, Virtual Infrastructure Management in Private and Hybrid Clouds. IEEE Internet Computing, vol. 13, no. 5, pp. 14-22, Sep./Oct. 2009.
