Welcome!

Microservices Expo Authors: Elizabeth White, Aruna Ravichandran, Liz McMillan, Pat Romanski, Cameron Van Orman

Related Topics: @BigDataExpo, Java IoT, Microservices Expo, Microsoft Cloud, Open Source Cloud, Agile Computing

@BigDataExpo: Article

An Alternative to Stored Procedure in Big Data Computation

Easier than stored procedure

As we know, the stored procedure is designed to handle computations involving complex business logics.

In the past, the data structure and business logics were so simple that one SQL statement was enough to achieve user's computational goal. With the rapid growing of information industry, users frequently find that they need to achieve the increasingly complex computational goals to out-perform their competitors. To address such computations, SQL alone is far from enough. Database programmers have the additional demands regarding the judge and loop statements, branches at multiple levels, or more accurate data transverse operations, as well as decomposing an obscure goal into several clear and actionable steps correlated with complex logics. It was for all these demands that the stored procedure was introduced.

The stored procedure has so far become the preferred tools for complex data computations, playing an quite important role. However, stored procedures still cause various inconveniences. For example, lots of their functions are inconvenient - hard to debug or migrate; and there are some databases with rather weak stored procedures. These problems sometimes affect the efficiency of database developers seriously.

The inconvenient functions of stored procedure include the incomplete step-by-step computation, weak support for set-lized data computing, no sequence number can be set for the data set, and no object-reference mechanism. Let's check it out with a simple example - find the "top n best-selling products in whatsoever State" by analyzing the regional sales table. In this scenario, those inconveniences makes it quite complex to write a stored procedure.

01 create or replace package salesPkg

02 as

03 type salesCur is ref cursor;

04 end;

05 CREATE OR REPLACE PROCEDURE topPro(io_cursor OUT salesPkg.salesCur)

06 is

07 varSql varchar2(2000);

08 tb_count integer;

09 BEGIN

10 select count(*) into tb_count from dba_tables where table_name='TOPPROTMP';

11 if tb_count=0 then

12 strCreate:='CREATE GLOBAL TEMPORARY TABLE TOPPROTMP (

stateTmp NUMBER not null,

productTmp varchar2(10)  not null,

amountTmp NUMBER not null

)

ON COMMIT PRESERVE ROWS';

13 execute immediate strCreate;

14 end if;

15 execute immediate 'truncate table TOPPROTMP';

16 insert into TOPPROTMP(stateTmp,productTmp,amountTmp)

select state,product,amount from stateSales a

where not(

(a.state,a.product) in (

select state,product from stateSales group by state,product having count(*) > 1

)

and rowid not in (

select min(rowid) from stateSales group by state,product having count(*)>1

)

)

order by state,product;

17 OPEN io_cursor for

18 SELECT productTmp FROM (

SELECT stateTmp,productTmp,amountTmp,rankorder

FROM (SELECT stateTmp,productTmp,amountTmp,RANK() OVER(PARTITION BY stateTmp ORDER BY amountTmp DESC) rankorder

FROM TOPPROTMP

)

WHERE rankorder<=10 order by stateTmp

)

GROUP BY productTmp

HAVING COUNT(*)=(SELECT COUNT(DISTINCT stateTmp ) FROM TOPPROTMP);

END;

In which, the code at line 16 is to filter the duplicates, and write the filtered data to the "temporary table". Since it is difficult to retrieve the distinct data directly, try this tip: find the duplicate data, then use "not" to reverse the condition, and the remaining is the distinct data. This function can be implemented by embedding two sub-queries.

For another example, the line 18 is to find the products ranking among the top 10 in whatsoever State. Firstly, use the window function to get the product rankings rankorder of each State; Secondly, filter out the top 10 products in each State; Lastly, get the products ranking among the top 10 in whatsoever State.SQL does not provide any functions to seek the intersection sets. So, to address this weak point, here is another tip: group by product so as to check if the number of a same products is equal to the number of States; if they are equal, it indicates that the product is ranking among the top 10 in every State.

Besides the inconvenient functions, the rather weak support for debug function is another inconvenience of stored procedures.

Although there are Oracle, DB2, and other databases offering debug function for their respective stored procedures in the market, their debug functions are incomplete. In running the stored procedure, no matter a SQL statement is long or short, no matter how many nested loops or the computational steps it contains, programmers can only view the result of one statement, and the intermediate procedure is completely transparent to them. It rather defeats the purpose of step-by-step debug, and compromising the benefits of running to cursor or the next step. The programmers are actually only allowed to view the cursor and the simple variables. Such variables are useful, but by no means the same important as the intermediate procedure of SQL. Another trouble is that lots of setting and preparation workloads are required to launch the debugging tools. The beginners can hardly handle it without guides.

The third inconvenience is that the stored procedure is hard to migrate. Generally speaking, SQL can be migrated with a few simple modifications. Despite the slight difference in syntax details, the SQLs from various vendors are all the supersets of the ANSI standard. However, the stored procedure is quite another thing. Migrating a stored procedure is much more complex than rewriting one because the relevant standards of various vendors differ greatly. In this situation, users have no choice but stick to one database vendor rigidly. There is not any room left for users to beat down the price if database vendors overcharge them on upgrading their servers, storages, and user license agreements.

SQL is an essential function of any database, while the stored procedure is not. Some databases only offer the relatively poor stored procedures, and the others do not provide it at all. Take the stored procedures of MySQL for example. Its functions and performances are worse than that of the MS SQL, Oracle and some other databases, and MySQL may throw many exceptions on intensive concurrency. MSSQL Compact, SQLITE, Hive, Access, and other databases do not support the stored procedures.

Obviously, the inconveniences of stored procedures have compromised the computational performance of database, and given much troubles to programmers - considerable development difficulty, inefficient development, and awkward maintenances. In addition, these inconveniences also affect the result of implementing business logics, achieving complex computational goal, and making smart business decision. Then, how to empower the stored procedure?

esProc is a database computing script specially built for addressing the complex computational goal. With the computational performance equally good and even better than that of stored procedures, esProc additionally provides the advantageous intuitive grid style, step-by-step computation, professional debug features, agile syntax, complete computational system, and seamless support for the interactive computations between various databases.

esProc is a scripting tool with a grid style. With esProc, the computational logics can be laid out in a 2D space conveniently. The business algorithm can thus be interpreted in the computer language more easily. esProc advocates the step-by-step computation. To put it concretely, esProc enable users to decompose a complex goal into several simple steps in its grid, and ultimately achieve a complex goal by accomplishing those simple objectives. Designed with the "step-by-step" thoughts, a really practical debug function is introduced with esProc, including various functions like the break point, stepping, run to cursor, start, and end. Unlike the fake debugging script as SQL/SP, esProc can debug the basic steps directly and straightforwardly, needless building a specific intermediate table. The break point can be set in any position without altering the code.

As a big data computing script, esProc supports the true data type of set. A member of a set can be the data of any simple data types, records, and/or other sets. esProc supports the ordered set, which means that users can access to the set member and perform the sequence-number-related computation, for example, ranking, sorting, year-over-year comparison, and link relative ratio comparison. The set-of-set can be used to represent the equal grouping, align grouping, and enum grouping. In addition, users can operate on the individual records in the same way as they used to operate on an object. esProc can represent the complex computation more easily with its agile syntax, for example, computing the relative positions in multi-level groupings, and grouping and summarizing by a specified set.

esProc can empower the stored procedures, and ultimately boost the computational capability of database, reduce the development difficulty for programmers, improve the development efficiency, and facilitate the code maintenance and migration. esProc can implement the complex data algorithms and business logics easily.

Finally, for the above case we've discussed to conclude the inconveniences of stored procedures, let's check out the esProc solution shown below:

More Stories By Jessica Qiu

Jessica Qiu is the editor of Raqsoft. She provides press releases for data computation and data analytics.

@MicroservicesExpo Stories
Is advanced scheduling in Kubernetes achievable? Yes, however, how do you properly accommodate every real-life scenario that a Kubernetes user might encounter? How do you leverage advanced scheduling techniques to shape and describe each scenario in easy-to-use rules and configurations? In his session at @DevOpsSummit at 21st Cloud Expo, Oleg Chunikhin, CTO at Kublr, will answer these questions and demonstrate techniques for implementing advanced scheduling. For example, using spot instances ...
We all know that end users experience the Internet primarily with mobile devices. From an app development perspective, we know that successfully responding to the needs of mobile customers depends on rapid DevOps – failing fast, in short, until the right solution evolves in your customers' relationship to your business. Whether you’re decomposing an SOA monolith, or developing a new application cloud natively, it’s not a question of using microservices – not doing so will be a path to eventual b...
Transforming cloud-based data into a reportable format can be a very expensive, time-intensive and complex operation. As a SaaS platform with more than 30 million global users, Cornerstone OnDemand’s challenge was to create a scalable solution that would improve the time it took customers to access their user data. Our Real-Time Data Warehouse (RTDW) process vastly reduced data time-to-availability from 24 hours to just 10 minutes. In his session at 21st Cloud Expo, Mark Goldin, Chief Technolo...
Digital transformation leaders have poured tons of money and effort into coding in recent years. And with good reason. To succeed at digital, you must be able to write great code. You also have to build a strong Agile culture so your coding efforts tightly align with market signals and business outcomes. But if your investments in testing haven’t kept pace with your investments in coding, you’ll lose. But if your investments in testing haven’t kept pace with your investments in coding, you’ll...
In his session at 21st Cloud Expo, Michael Burley, a Senior Business Development Executive in IT Services at NetApp, will describe how NetApp designed a three-year program of work to migrate 25PB of a major telco's enterprise data to a new STaaS platform, and then secured a long-term contract to manage and operate the platform. This significant program blended the best of NetApp’s solutions and services capabilities to enable this telco’s successful adoption of private cloud storage and launchi...
Enterprises are adopting Kubernetes to accelerate the development and the delivery of cloud-native applications. However, sharing a Kubernetes cluster between members of the same team can be challenging. And, sharing clusters across multiple teams is even harder. Kubernetes offers several constructs to help implement segmentation and isolation. However, these primitives can be complex to understand and apply. As a result, it’s becoming common for enterprises to end up with several clusters. Thi...
Containers are rapidly finding their way into enterprise data centers, but change is difficult. How do enterprises transform their architecture with technologies like containers without losing the reliable components of their current solutions? In his session at @DevOpsSummit at 21st Cloud Expo, Tony Campbell, Director, Educational Services at CoreOS, will explore the challenges organizations are facing today as they move to containers and go over how Kubernetes applications can deploy with lega...
Today most companies are adopting or evaluating container technology - Docker in particular - to speed up application deployment, drive down cost, ease management and make application delivery more flexible overall. As with most new architectures, this dream takes significant work to become a reality. Even when you do get your application componentized enough and packaged properly, there are still challenges for DevOps teams to making the shift to continuous delivery and achieving that reducti...
DevOps at Cloud Expo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 21st Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to w...
SYS-CON Events announced today that Cloud Academy has been named “Bronze Sponsor” of SYS-CON's 21st International Cloud Expo®, which will take place on Oct. 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Cloud Academy is the leading technology training platform for enterprise multi-cloud infrastructure. Cloud Academy is trusted by leading companies to deliver continuous learning solutions across Amazon Web Services, Microsoft Azure, Google Cloud Platform, and the most...
The last two years has seen discussions about cloud computing evolve from the public / private / hybrid split to the reality that most enterprises will be creating a complex, multi-cloud strategy. Companies are wary of committing all of their resources to a single cloud, and instead are choosing to spread the risk – and the benefits – of cloud computing across multiple providers and internal infrastructures, as they follow their business needs. Will this approach be successful? How large is the ...
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm. In their Day 3 Keynote at 20th Cloud Expo, Chris Brown, a Solutions Marketing Manager at Nutanix, and Mark Lav...
Many organizations adopt DevOps to reduce cycle times and deliver software faster; some take on DevOps to drive higher quality and better end-user experience; others look to DevOps for a clearer line-of-sight to customers to drive better business impacts. In truth, these three foundations go together. In this power panel at @DevOpsSummit 21st Cloud Expo, moderated by DevOps Conference Co-Chair Andi Mann, industry experts will discuss how leading organizations build application success from all...
DevSecOps – a trend around transformation in process, people and technology – is about breaking down silos and waste along the software development lifecycle and using agile methodologies, automation and insights to help get apps to market faster. This leads to higher quality apps, greater trust in organizations, less organizational friction, and ultimately a five-star customer experience. These apps are the new competitive currency in this digital economy and they’re powered by data. Without ...
A common misconception about the cloud is that one size fits all. Companies expecting to run all of their operations using one cloud solution or service must realize that doing so is akin to forcing the totality of their business functionality into a straightjacket. Unlocking the full potential of the cloud means embracing the multi-cloud future where businesses use their own cloud, and/or clouds from different vendors, to support separate functions or product groups. There is no single cloud so...
For most organizations, the move to hybrid cloud is now a question of when, not if. Fully 82% of enterprises plan to have a hybrid cloud strategy this year, according to Infoholic Research. The worldwide hybrid cloud computing market is expected to grow about 34% annually over the next five years, reaching $241.13 billion by 2022. Companies are embracing hybrid cloud because of the many advantages it offers compared to relying on a single provider for all of their cloud needs. Hybrid offers bala...
With the modern notion of digital transformation, enterprises are chipping away at the fundamental organizational and operational structures that have been with us since the nineteenth century or earlier. One remarkable casualty: the business process. Business processes have become so ingrained in how we envision large organizations operating and the roles people play within them that relegating them to the scrap heap is almost unimaginable, and unquestionably transformative. In the Digital ...
These days, APIs have become an integral part of the digital transformation journey for all enterprises. Every digital innovation story is connected to APIs . But have you ever pondered over to know what are the source of these APIs? Let me explain - APIs sources can be varied, internal or external, solving different purposes, but mostly categorized into the following two categories. Data lakes is a term used to represent disconnected but relevant data that are used by various business units wit...
The nature of the technology business is forward-thinking. It focuses on the future and what’s coming next. Innovations and creativity in our world of software development strive to improve the status quo and increase customer satisfaction through speed and increased connectivity. Yet, while it's exciting to see enterprises embrace new ways of thinking and advance their processes with cutting edge technology, it rarely happens rapidly or even simultaneously across all industries.
It has never been a better time to be a developer! Thanks to cloud computing, deploying our applications is much easier than it used to be. How we deploy our apps continues to evolve thanks to cloud hosting, Platform-as-a-Service (PaaS), and now Function-as-a-Service. FaaS is the concept of serverless computing via serverless architectures. Software developers can leverage this to deploy an individual "function", action, or piece of business logic. They are expected to start within milliseconds...