
re: What should I use to play with over 400 million rows of data?

Posted by foshizzle
Washington DC metro
Member since Mar 2008
40599 posts
Posted on 1/10/17 at 9:55 pm to
quote:
full enterprise-level


Highly unlikely he has the hardware to support it.


To be clear, I installed Oracle Enterprise 12c on a Surface Pro 3. I'm not claiming to run a large corporation on this installation; I use it for demos to potential clients. Demos don't usually require 500-million-row queries unless the potential client is seriously interested, and when they do I don't use my Surface for that.

>>It sounds like all of your data is being dumped into tables and it's not very usable. Are the tables indexed at all?

Actually, on the Surface I'm using indexes and partitioning. Everything runs exactly as delivered for demonstration purposes, just without the hardware to support a major delivery. For demos meant to stir initial interest I don't need to deploy major hardware, but it's easier to start with enterprise Oracle to avoid any version problems, so that the demo runs on real, proven software. For volume testing I obviously don't use the Surface, but the software is exactly the same.
Posted by Kujo
225-911-5736
Member since Dec 2015
6031 posts
Posted on 1/11/17 at 1:40 am to
quote:

Are the tables indexed at all?


Is this something I can do myself? There's someone in the IT dept that takes 3 weeks or more to get anything done.
Posted by BoogaBear
Member since Jul 2013
6457 posts
Posted on 1/11/17 at 5:11 am to
If you have the permissions to do so, yes you can. I doubt that you do, being an end user.

Open the query that is taking a while and, instead of running it in SQL Developer, hit F10 to run the explain plan.

You can post the results here or email them to me if you'd like. My suspicion is that you have some full table scans or nested loops. This will help you narrow down what should be indexed.
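
If you want to see what "full table scan vs. index" looks like before you have access to anything, here is a minimal sketch. Big assumptions: it uses SQLite through Python's built-in sqlite3 module as a stand-in for Oracle, and the table and column names (vehicle_log, vehicle_id, logged_at, speed) are made up for illustration. In Oracle the rough equivalent is EXPLAIN PLAN FOR followed by a query against DBMS_XPLAN.DISPLAY, where a full scan shows up as TABLE ACCESS FULL, but the idea is the same: compare the plan before and after the index exists.

```python
import sqlite3

# Hypothetical stand-in: SQLite instead of Oracle, with a made-up table
# (vehicle_log) and column names chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vehicle_log (vehicle_id INTEGER, logged_at TEXT, speed REAL)"
)
conn.executemany(
    "INSERT INTO vehicle_log VALUES (?, ?, ?)",
    [(i % 500, f"2017-01-10 00:{i % 60:02d}", float(i % 80)) for i in range(10_000)],
)

QUERY = "SELECT AVG(speed) FROM vehicle_log WHERE vehicle_id = 42"


def plan(sql):
    """Return the detail text of each step in SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # last column holds the description


# No index yet, so the filter on vehicle_id has to read the whole table.
plan_before = plan(QUERY)

# Index the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_vehicle_log_vehicle ON vehicle_log (vehicle_id)")

# Same query, but now the planner can look rows up through the index.
plan_after = plan(QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan should describe a scan of the whole table and the second a search using the new index. The exact wording varies by SQLite version, and Oracle's plan output looks different again, so treat this as a picture of the idea rather than what SQL Developer will show you.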

Does your company not have a data warehouse?

400 million rows is technically a lot of data, but it's not unmanageable. We have some stuff that processes 50 million a day. Some have around 19 billion records.
Posted by tokenBoiler
Lafayette, Indiana
Member since Aug 2012
4824 posts
Posted on 1/11/17 at 10:07 pm to
quote:

I can play with one vehicle's data for a day, and give a report about that data, or use it in an investigation into a driver or something "pinpoint".....but when I look at an entire day of all vehicles....I crash before I can begin.



For this kind of data exploration, I'll pitch in another vote for R. There's a learning curve, you bet, but you'll amaze your friends (and bosses) with the kinds of plots and graphs you can crank out on the fly, once you get a handle on how it works.