IEEE CloudCom Conference on Cloud Computing Technology and Science

Last week I was in Taipei with Nam-Luc to present the AROM paper. I wanted to come back to this year's trends at the conference, which, by the way, give really good insight into the hot topics in cloud, distributed computing and HPC. I will not dive into the details of each of them; if you have any questions, just post a comment or send me a mail!

Data Mining on cloud

A significant part of the keynotes and talks was about using the cloud and its computing power for data mining. Even if it could make sense from a high-level view, I still find it a bit bizarre to use the cloud for this kind of computation. Let me argue my view:
(1) bandwidth: it is just awful and we have no control over it, which is a serious issue when dealing with large volumes of data,
(2) elasticity: it is not a good argument, since few processing layers can really be elastic; even Hadoop must have a pre-defined set of worker nodes before starting a job,
(3) multi-tenancy: the main purpose of a cloud architecture is to serve the maximum number of users by minimizing their resource usage, which is far from the objectives of data mining, which are much closer to those of the grid,
(4) performance: because of the virtualization of network and resources, cloud applications are not really high-performance oriented,
(5) industry trends: if we look at the trends in the data mining industry, especially in the data warehouse and enterprise world, the direction is completely the opposite. The major players focus on high-end appliances with in-memory computing deeply integrated with the underlying platform. In addition, almost all of them propose a cluster of a few machines, rarely more than 10, connected by InfiniBand.
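
To make the bandwidth argument in point (1) a bit more concrete, here is a back-of-the-envelope sketch. All the figures are assumptions chosen for illustration (a 10 TB dataset, a 100 Mbit/s and a 1 Gbit/s link to a cloud provider, and a 40 Gbit/s local interconnect); they are not measurements from the conference or from any provider.

```python
def transfer_time_hours(dataset_gb: float, bandwidth_mbps: float) -> float:
    """Ideal time, in hours, to move dataset_gb gigabytes over a link
    of bandwidth_mbps megabits per second (no protocol overhead)."""
    dataset_megabits = dataset_gb * 1000 * 8  # decimal GB -> megabits
    seconds = dataset_megabits / bandwidth_mbps
    return seconds / 3600


# Hypothetical scenario: a 10 TB dataset and three assumed link speeds.
DATASET_GB = 10_000
LINKS_MBPS = {
    "100 Mbit/s link to the cloud (assumed)": 100,
    "1 Gbit/s link to the cloud (assumed)": 1_000,
    "40 Gbit/s local interconnect (assumed)": 40_000,
}

for label, mbps in LINKS_MBPS.items():
    hours = transfer_time_hours(DATASET_GB, mbps)
    print(f"{label}: {hours:.1f} h")
```

Even under these idealized assumptions, the gap between a wide-area link and a local high-speed interconnect is of several orders of magnitude, which is the heart of the objection.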

Using the cloud as a platform for applications

Many talks were about using the cloud for vertical markets such as e-health, food tracking, the Internet of Things, etc. This is a trend we have seen for one or two years: a lot of people think we could use the cloud as a new generation of service platform. That is interesting because in the Telco world the concept of SDP (Service Delivery Platform) is well known; however, nobody points out what the cloud lacks to be functionally equivalent to an SDP. As a result, we present the cloud as a service delivery environment, but each application requires rebuilding a complete ad hoc stack. This is an issue that architects and developers will have to face more and more.

Security and accountability

Security is definitely a hot topic. That makes sense, since it is one of the top show stoppers for moving to the cloud. Going further, a really interesting keynote from HP (leading the EU research project A4Cloud [1]) described the concept of accountability, not only of the service provider but also of the complete chain of services that compose the main one. Actually, when we think about it, we quickly realize that the cloud brings the typical issues of outsourcing operations, and in addition it brings a longer chain of trust, limited auditability, a new set of liability and legal issues, and a new set of technical attacks (typically hypervisor-oriented). So security is definitely a rich source of research.

Cloud Federation

A lot of people explained the need for federating clouds, mainly because of vendor lock-in but also for reliability and even for security (for sharding encrypted data across different providers). Federation opens up a new class of technical issues. Going further, I see the hybrid cloud as a particular case of federation. Indeed, I am not one of those who think that hybrid cloud will be requested and applied by every application, mainly because most enterprise applications are data-driven, so bursting onto a cloud would require migrating and synchronizing the data between the private and the public cloud. That would bring a lot of complexity. Smart architecture designs exist, such as messaging and asynchronous distributed caches, but they would require a complete redesign of the applications. Instead, I see the hybrid cloud more as a federation of a private and a public cloud. In this case, some services run in one of the clouds and the applications run in the other. That is a much more realistic scenario, and a typical case of federation.
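
As an illustration of the security argument for federation (spreading data across providers so that no single one can read it), here is a minimal sketch. It uses XOR-based secret splitting, which is a technique I am swapping in for the example and not necessarily what the speakers had in mind; the function names and the three-provider setup are hypothetical.

```python
import secrets
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_for_providers(data: bytes, n_providers: int) -> List[bytes]:
    """Split data into n shares, one per provider.
    All n shares are needed to rebuild the data; any smaller
    subset is indistinguishable from random bytes."""
    if n_providers < 2:
        raise ValueError("need at least two providers")
    pads = [secrets.token_bytes(len(data)) for _ in range(n_providers - 1)]
    last_share = reduce(xor_bytes, pads, data)  # data XOR all random pads
    return pads + [last_share]


def combine_shares(shares: List[bytes]) -> bytes:
    """Rebuild the original data by XOR-ing all shares together."""
    return reduce(xor_bytes, shares)


# Hypothetical usage with three cloud providers.
record = b"patient-42: blood type O+"
shares = split_for_providers(record, 3)
print(combine_shares(shares) == record)
```

The price of this simple scheme is that losing any one provider loses the data, so it trades reliability for confidentiality; threshold schemes exist to balance the two, but they are beyond this sketch.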

Standardization

We have been speaking about cloud standardization for a while, and too many standards and standardization bodies are fighting each other. However, it seems that some initiatives aim at federating the existing standards. The most promising one is the Global Inter-Cloud Technology Forum (GICTF) [2].

GPU for HPC

It is interesting to see how High Performance Computing (HPC) researchers have moved from grid to GPU and how they have dramatically decreased the cost of their infrastructures while keeping the same performance on GPU. Today we see not only computational simulations, but also data mining algorithms and statistics packages. However, a GPU still has to work within its RAM, and big data problems go beyond that RAM capacity. That is a really exciting research area.
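
The usual way around the memory limit mentioned above is to process the data in chunks that fit on the device and to combine partial results. Here is a minimal sketch of that pattern in plain Python, with no real GPU involved: the `max_items` budget stands in for the device memory, and the `sum` call stands in for a kernel launched on one chunk. Both are assumptions for the sake of the example.

```python
from typing import Iterable, Iterator, List


def chunks(values: Iterable[float], max_items: int) -> Iterator[List[float]]:
    """Yield successive batches of at most max_items values."""
    batch: List[float] = []
    for value in values:
        batch.append(value)
        if len(batch) == max_items:
            yield batch
            batch = []
    if batch:  # last, possibly smaller, batch
        yield batch


def streaming_mean(values: Iterable[float], max_items: int) -> float:
    """Compute the mean while holding at most max_items values at once."""
    total = 0.0
    count = 0
    for batch in chunks(values, max_items):
        total += sum(batch)  # stand-in for a computation done on the device
        count += len(batch)
    if count == 0:
        raise ValueError("no values to average")
    return total / count


# Ten values processed with a "device" that holds only three at a time.
print(streaming_mean(range(1, 11), max_items=3))
```

This works nicely for computations that decompose into partial results, such as sums or counts. Many data mining algorithms do not decompose so easily, which is part of why this remains a research area.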

Missing trend

Everybody speaks about big data, but few speak about collecting it. In the off-line discussions I had with researchers, mainly working in the IT field, they mentioned the significant problems they had collecting and aggregating events.

As you can see, there are really interesting and important topics for the research community for the coming year!

References

[1] A4Cloud CORDIS page, http://cordis.europa.eu/search/index.cfm?fuseaction=proj.document&PJ_RCN=134045
[2] GICTF, http://www.gictf.jp/index_e.html


 
