re: What should I use to play with over 400 million rows of data?
Posted on 1/10/17 at 9:55 pm to BoogaBear
quote:
full enterprise-level
Highly unlikely he has the hardware to support it.
To be clear, I installed Oracle Enterprise 12c on a Surface Pro 3. I'm not claiming to run a large corporation on this installation; I use it for demos to potential clients. Demos don't usually require 500-million-row queries unless the potential client is seriously interested, and when they do, I don't use my Surface for that.
>>It sounds like all of your data is being dumped into tables and it's not very usable. Are the tables indexed at all?
Actually, on the Surface I'm using indexes and partitioning. Everything runs exactly as delivered for demonstration purposes, just without the hardware to support a major delivery. For demos meant to stir initial interest I don't need to deploy major hardware, but it's easier to start with Enterprise Oracle to avoid any version problems, so that the demo runs on real, proven software. For volume testing I obviously don't use the Surface, but the software is exactly the same.
Posted on 1/11/17 at 1:40 am to foshizzle
quote:
Are the tables indexed at all?
Is this something I can do myself? There's someone in the IT dept that takes 3 weeks or more to get anything done.
Posted on 1/11/17 at 5:11 am to Kujo
If you have the permissions to do so, yes, you can. I doubt that you do, being an end user.
Open the query that is taking a while and, instead of running it in SQL Developer, hit F10 to run the explain plan.
You can post the results here or email them to me if you'd like. My suspicion is that you have some full table scans or nested loops. This will help you narrow down what should be indexed.
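For anyone who wants to see what a plan looks like before and after an index, here is a minimal sketch. It uses Python's built-in SQLite as a stand-in, not Oracle, so the plan wording will differ from what F10 shows in SQL Developer. The `gps_log` table, its columns, and the index name are made up for illustration.

```python
import sqlite3


def plan_details(conn, sql, params=()):
    """Return the human-readable 'detail' text of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last column of each plan row.
    return [row[-1] for row in rows]


# Hypothetical vehicle-tracking table, held in memory for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gps_log (vehicle_id INTEGER, logged_at TEXT, speed REAL)")
conn.executemany(
    "INSERT INTO gps_log VALUES (?, ?, ?)",
    [(i % 50, f"2017-01-10T00:{i % 60:02d}", float(i % 80)) for i in range(1000)],
)

query = "SELECT * FROM gps_log WHERE vehicle_id = ?"

# Without an index, the only way to find one vehicle is to read every row.
before = plan_details(conn, query, (7,))

# With an index on the filter column, the engine can seek straight to the rows.
conn.execute("CREATE INDEX idx_gps_vehicle ON gps_log (vehicle_id)")
after = plan_details(conn, query, (7,))

print("before index:", before)
print("after index: ", after)
```

The first plan should report a scan of the whole table and the second a search using `idx_gps_vehicle`. That is the same full-scan versus indexed-lookup difference to look for in the Oracle plan.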
Does your company not have a data warehouse?
400 million rows is technically a lot of data, but it's not unmanageable. We have some stuff that processes 50 million a day, and some that holds around 19 billion records.
Posted on 1/11/17 at 10:07 pm to Kujo
quote:
I can play with one vehicle's data for a day, and give a report about that data, or use it in an investigation into a driver or something "pinpoint".....but when I look at an entire day of all vehicles....I crash before I can begin.
For this kind of data exploration, I'll pitch in another vote for R. There's a learning curve, you bet, but you'll amaze your friends (and bosses) with the kinds of plots and graphs you can crank out on the fly, once you get a handle on how it works.
