Microservices Expo Authors: Hollis Tibbetts, Elizabeth White, Liz McMillan, Sematext Blog, Lori MacVittie

Related Topics: @BigDataExpo, Java IoT, Microservices Expo, Microsoft Cloud, Open Source Cloud, Agile Computing

@BigDataExpo: Article

An Alternative to Stored Procedure in Big Data Computation

Easier than stored procedure

As we know, the stored procedure is designed to handle computations involving complex business logics.

In the past, the data structure and business logics were so simple that one SQL statement was enough to achieve user's computational goal. With the rapid growing of information industry, users frequently find that they need to achieve the increasingly complex computational goals to out-perform their competitors. To address such computations, SQL alone is far from enough. Database programmers have the additional demands regarding the judge and loop statements, branches at multiple levels, or more accurate data transverse operations, as well as decomposing an obscure goal into several clear and actionable steps correlated with complex logics. It was for all these demands that the stored procedure was introduced.

The stored procedure has so far become the preferred tools for complex data computations, playing an quite important role. However, stored procedures still cause various inconveniences. For example, lots of their functions are inconvenient - hard to debug or migrate; and there are some databases with rather weak stored procedures. These problems sometimes affect the efficiency of database developers seriously.

The inconvenient functions of stored procedure include the incomplete step-by-step computation, weak support for set-lized data computing, no sequence number can be set for the data set, and no object-reference mechanism. Let's check it out with a simple example - find the "top n best-selling products in whatsoever State" by analyzing the regional sales table. In this scenario, those inconveniences makes it quite complex to write a stored procedure.

01 create or replace package salesPkg

02 as

03 type salesCur is ref cursor;

04 end;

05 CREATE OR REPLACE PROCEDURE topPro(io_cursor OUT salesPkg.salesCur)

06 is

07 varSql varchar2(2000);

08 tb_count integer;


10 select count(*) into tb_count from dba_tables where table_name='TOPPROTMP';

11 if tb_count=0 then


stateTmp NUMBER not null,

productTmp varchar2(10)  not null,

amountTmp NUMBER not null



13 execute immediate strCreate;

14 end if;

15 execute immediate 'truncate table TOPPROTMP';

16 insert into TOPPROTMP(stateTmp,productTmp,amountTmp)

select state,product,amount from stateSales a

where not(

(a.state,a.product) in (

select state,product from stateSales group by state,product having count(*) > 1


and rowid not in (

select min(rowid) from stateSales group by state,product having count(*)>1



order by state,product;

17 OPEN io_cursor for

18 SELECT productTmp FROM (

SELECT stateTmp,productTmp,amountTmp,rankorder

FROM (SELECT stateTmp,productTmp,amountTmp,RANK() OVER(PARTITION BY stateTmp ORDER BY amountTmp DESC) rankorder



WHERE rankorder<=10 order by stateTmp


GROUP BY productTmp



In which, the code at line 16 is to filter the duplicates, and write the filtered data to the "temporary table". Since it is difficult to retrieve the distinct data directly, try this tip: find the duplicate data, then use "not" to reverse the condition, and the remaining is the distinct data. This function can be implemented by embedding two sub-queries.

For another example, the line 18 is to find the products ranking among the top 10 in whatsoever State. Firstly, use the window function to get the product rankings rankorder of each State; Secondly, filter out the top 10 products in each State; Lastly, get the products ranking among the top 10 in whatsoever State.SQL does not provide any functions to seek the intersection sets. So, to address this weak point, here is another tip: group by product so as to check if the number of a same products is equal to the number of States; if they are equal, it indicates that the product is ranking among the top 10 in every State.

Besides the inconvenient functions, the rather weak support for debug function is another inconvenience of stored procedures.

Although there are Oracle, DB2, and other databases offering debug function for their respective stored procedures in the market, their debug functions are incomplete. In running the stored procedure, no matter a SQL statement is long or short, no matter how many nested loops or the computational steps it contains, programmers can only view the result of one statement, and the intermediate procedure is completely transparent to them. It rather defeats the purpose of step-by-step debug, and compromising the benefits of running to cursor or the next step. The programmers are actually only allowed to view the cursor and the simple variables. Such variables are useful, but by no means the same important as the intermediate procedure of SQL. Another trouble is that lots of setting and preparation workloads are required to launch the debugging tools. The beginners can hardly handle it without guides.

The third inconvenience is that the stored procedure is hard to migrate. Generally speaking, SQL can be migrated with a few simple modifications. Despite the slight difference in syntax details, the SQLs from various vendors are all the supersets of the ANSI standard. However, the stored procedure is quite another thing. Migrating a stored procedure is much more complex than rewriting one because the relevant standards of various vendors differ greatly. In this situation, users have no choice but stick to one database vendor rigidly. There is not any room left for users to beat down the price if database vendors overcharge them on upgrading their servers, storages, and user license agreements.

SQL is an essential function of any database, while the stored procedure is not. Some databases only offer the relatively poor stored procedures, and the others do not provide it at all. Take the stored procedures of MySQL for example. Its functions and performances are worse than that of the MS SQL, Oracle and some other databases, and MySQL may throw many exceptions on intensive concurrency. MSSQL Compact, SQLITE, Hive, Access, and other databases do not support the stored procedures.

Obviously, the inconveniences of stored procedures have compromised the computational performance of database, and given much troubles to programmers - considerable development difficulty, inefficient development, and awkward maintenances. In addition, these inconveniences also affect the result of implementing business logics, achieving complex computational goal, and making smart business decision. Then, how to empower the stored procedure?

esProc is a database computing script specially built for addressing the complex computational goal. With the computational performance equally good and even better than that of stored procedures, esProc additionally provides the advantageous intuitive grid style, step-by-step computation, professional debug features, agile syntax, complete computational system, and seamless support for the interactive computations between various databases.

esProc is a scripting tool with a grid style. With esProc, the computational logics can be laid out in a 2D space conveniently. The business algorithm can thus be interpreted in the computer language more easily. esProc advocates the step-by-step computation. To put it concretely, esProc enable users to decompose a complex goal into several simple steps in its grid, and ultimately achieve a complex goal by accomplishing those simple objectives. Designed with the "step-by-step" thoughts, a really practical debug function is introduced with esProc, including various functions like the break point, stepping, run to cursor, start, and end. Unlike the fake debugging script as SQL/SP, esProc can debug the basic steps directly and straightforwardly, needless building a specific intermediate table. The break point can be set in any position without altering the code.

As a big data computing script, esProc supports the true data type of set. A member of a set can be the data of any simple data types, records, and/or other sets. esProc supports the ordered set, which means that users can access to the set member and perform the sequence-number-related computation, for example, ranking, sorting, year-over-year comparison, and link relative ratio comparison. The set-of-set can be used to represent the equal grouping, align grouping, and enum grouping. In addition, users can operate on the individual records in the same way as they used to operate on an object. esProc can represent the complex computation more easily with its agile syntax, for example, computing the relative positions in multi-level groupings, and grouping and summarizing by a specified set.

esProc can empower the stored procedures, and ultimately boost the computational capability of database, reduce the development difficulty for programmers, improve the development efficiency, and facilitate the code maintenance and migration. esProc can implement the complex data algorithms and business logics easily.

Finally, for the above case we've discussed to conclude the inconveniences of stored procedures, let's check out the esProc solution shown below:

More Stories By Jessica Qiu

Jessica Qiu is the editor of Raqsoft. She provides press releases for data computation and data analytics.

@MicroservicesExpo Stories
SYS-CON Events announced today that Numerex Corp, a leading provider of managed enterprise solutions enabling the Internet of Things (IoT), will exhibit at the 19th International Cloud Expo | @ThingsExpo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Numerex Corp. (NASDAQ:NMRX) is a leading provider of managed enterprise solutions enabling the Internet of Things (IoT). The Company's solutions produce new revenue streams or create operating...
Monitoring of Docker environments is challenging. Why? Because each container typically runs a single process, has its own environment, utilizes virtual networks, or has various methods of managing storage. Traditional monitoring solutions take metrics from each server and applications they run. These servers and applications running on them are typically very static, with very long uptimes. Docker deployments are different: a set of containers may run many applications, all sharing the resource...
The best way to leverage your Cloud Expo presence as a sponsor and exhibitor is to plan your news announcements around our events. The press covering Cloud Expo and @ThingsExpo will have access to these releases and will amplify your news announcements. More than two dozen Cloud companies either set deals at our shows or have announced their mergers and acquisitions at Cloud Expo. Product announcements during our show provide your company with the most reach through our targeted audiences.
When we talk about the impact of BYOD and BYOA and the Internet of Things, we often focus on the impact on data center architectures. That's because there will be an increasing need for authentication, for access control, for security, for application delivery as the number of potential endpoints (clients, devices, things) increases. That means scale in the data center. What we gloss over, what we skip, is that before any of these "things" ever makes a request to access an application it had to...
As we enter the final week before the 19th International Cloud Expo | @ThingsExpo in Santa Clara, CA, it's time for me to reflect on six big topics that will be important during the show. Hybrid Cloud This general-purpose term seems to provide a comfort zone for many enterprise IT managers. It sounds reassuring to be able to work with one of the major public-cloud providers like AWS or Microsoft Azure while still maintaining an on-site presence.
The Internet of Things will challenge the status quo of how IT and development organizations operate. Or will it? Certainly the fog layer of IoT requires special insights about data ontology, security and transactional integrity. But the developmental challenges are the same: People, Process and Platform and how we integrate our thinking to solve complicated problems. In his session at 19th Cloud Expo, Craig Sproule, CEO of Metavine, will demonstrate how to move beyond today's coding paradigm ...
Virgil consists of an open-source encryption library, which implements Cryptographic Message Syntax (CMS) and Elliptic Curve Integrated Encryption Scheme (ECIES) (including RSA schema), a Key Management API, and a cloud-based Key Management Service (Virgil Keys). The Virgil Keys Service consists of a public key service and a private key escrow service. 

SYS-CON Events announced today that Tintri Inc., a leading producer of VM-aware storage (VAS) for virtualization and cloud environments, will present at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Tintri VM-aware storage is the simplest for virtualized applications and cloud. Organizations including GE, Toyota, United Healthcare, NASA and 6 of the Fortune 15 have said “No to LUNs.” With Tintri they manag...
SYS-CON Events announced today that eCube Systems, the leading provider of modern development tools and best practices for Continuous Integration on OpenVMS, will exhibit at SYS-CON's @DevOpsSummit at Cloud Expo New York, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. eCube Systems offers a family of middleware products and development tools that maximize return on technology investment by leveraging existing technical equity to meet evolving business needs. ...
All clouds are not equal. To succeed in a DevOps context, organizations should plan to develop/deploy apps across a choice of on-premise and public clouds simultaneously depending on the business needs. This is where the concept of the Lean Cloud comes in - resting on the idea that you often need to relocate your app modules over their life cycles for both innovation and operational efficiency in the cloud. In his session at @DevOpsSummit at19th Cloud Expo, Valentin (Val) Bercovici, CTO of So...
Apache Hadoop is a key technology for gaining business insights from your Big Data, but the penetration into enterprises is shockingly low. In fact, Apache Hadoop and Big Data proponents recognize that this technology has not yet achieved its game-changing business potential. In his session at 19th Cloud Expo, John Mertic, director of program management for ODPi at The Linux Foundation, will explain why this is, how we can work together as an open data community to increase adoption, and the i...
operations aren’t merging to become one discipline. Nor is operations simply going away. Rather, DevOps is leading software development and operations – together with other practices such as security – to collaborate and coexist with less overhead and conflict than in the past. In his session at @DevOpsSummit at 19th Cloud Expo, Gordon Haff, Red Hat Technology Evangelist, will discuss what modern operational practices look like in a world in which applications are more loosely coupled, are deve...
DevOps is a term that comes full of controversy. A lot of people are on the bandwagon, while others are waiting for the term to jump the shark, and eventually go back to business as usual. Regardless of where you are along the specturm of loving or hating the term DevOps, one thing is certain. More and more people are using it to describe a system administrator who uses scripts, or tools like, Chef, Puppet or Ansible, in order to provision infrastructure. There is also usually an expectation of...
DevOps is speeding towards the IT world like a freight train and the hype around it is deafening. There is no reason to be afraid of this change as it is the natural reaction to the agile movement that revolutionized development just a few years ago. By definition, DevOps is the natural alignment of IT performance to business profitability. The relevance of this has yet to be quantified but it has been suggested that the route to the CEO’s chair will come from the IT leaders that successfully ma...
This is a no-hype, pragmatic post about why I think you should consider architecting your next project the way SOA and/or microservices suggest. No matter if it’s a greenfield approach or if you’re in dire need of refactoring. Please note: considering still keeps open the option of not taking that approach. After reading this, you will have a better idea about whether building multiple small components instead of a single, large component makes sense for your project. This post assumes that you...
As software becomes more and more complex, we, as software developers, have been splitting up our code into smaller and smaller components. This is also true for the environment in which we run our code: going from bare metal, to VMs to the modern-day Cloud Native world of containers, schedulers and microservices. While we have figured out how to run containerized applications in the cloud using schedulers, we've yet to come up with a good solution to bridge the gap between getting your conta...
DevOps is being widely accepted (if not fully adopted) as essential in enterprise IT. But as Enterprise DevOps gains maturity, expands scope, and increases velocity, the need for data-driven decisions across teams becomes more acute. DevOps teams in any modern business must wrangle the ‘digital exhaust’ from the delivery toolchain, "pervasive" and "cognitive" computing, APIs and services, mobile devices and applications, the Internet of Things, and now even blockchain. In this power panel at @...
DevOps theory promotes a culture of continuous improvement built on collaboration, empowerment, systems thinking, and feedback loops. But how do you collaborate effectively across the traditional silos? How can you make decisions without system-wide visibility? How can you see the whole system when it is spread across teams and locations? How do you close feedback loops across teams and activities delivering complex multi-tier, cloud, container, serverless, and/or API-based services?
SYS-CON Events announced today that SoftNet Solutions will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. SoftNet Solutions specializes in Enterprise Solutions for Hadoop and Big Data. It offers customers the most open, robust, and value-conscious portfolio of solutions, services, and tools for the shortest route to success with Big Data. The unique differentiator is the ability to architect and ...
Without lifecycle traceability and visibility across the tool chain, stakeholders from Planning-to-Ops have limited insight and answers to who, what, when, why and how across the DevOps lifecycle. This impacts the ability to deliver high quality software at the needed velocity to drive positive business outcomes. In his session at @DevOpsSummit 19th Cloud Expo, Eric Robertson, General Manager at CollabNet, will show how customers are able to achieve a level of transparency that enables everyon...