Hive is an all-in-one project management tool built to "help teams move faster", however they work. Features are created based on user requests and updated weekly, making Hive the world's first democratic software platform. It is best known for its capabilities in project management, time management, team collaboration, automation, and a range of integrations with third-party software. Hive is free for solo users, with premium versions available for teams and enterprises.
| Capabilities | |
|---|---|
| Segment | |
| Deployment | Cloud / SaaS / Web-based, Android mobile, iPad mobile, iPhone mobile |
| Support | 24/7 (live rep), chat, email/help desk, FAQs/forum, knowledge base, phone support |
| Training | Documentation |
| Other languages | English |
Compare Hive with other popular tools in the same category.
If you are a data analyst and an expert in SQL, then use Hive. Hive is very easy to work with, especially if you are a SQL person. I use both Hive and Pig at work. I use Hive mainly for ad hoc queries and reports. For BI reports Hive is the best, since you can reuse all the SQL that you have written for traditional data warehouses. Also, with HiveServer2 you get real JDBC support, so you can plug your BI tools into it. Many more SQL features such as cubes, rollups, windowing, lag, lead, etc. are being added to Hive through the Hortonworks Stinger initiative. Hive also produces very compact code, which is always good for reading and debugging.
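To make the windowing features mentioned above concrete, here is a minimal sketch of a `LAG` query. It is an illustration only: it runs against an in-memory SQLite database (a stand-in, since no Hive cluster is assumed here, and it needs SQLite 3.25 or newer for window functions), and the `sales` table and its columns are hypothetical. The `LAG(...) OVER (ORDER BY ...)` clause itself is written the same way in HiveQL.

```python
import sqlite3


def daily_change(rows):
    """Return (day, revenue, previous_revenue, change) for each day.

    `rows` is a list of (day, revenue) tuples. SQLite is used here only
    as a stand-in engine; the table and column names are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (day TEXT, revenue INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

    # LAG looks one row back within the window ordering; the first row
    # has no predecessor, so it yields NULL (None in Python).
    query = """
        SELECT day,
               revenue,
               LAG(revenue) OVER (ORDER BY day) AS prev_revenue,
               revenue - LAG(revenue) OVER (ORDER BY day) AS change
        FROM sales
        ORDER BY day
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in daily_change([("2024-01-01", 100), ("2024-01-02", 130)]):
        print(row)
```

Because the query is plain SQL, an analyst who already writes this kind of report for a traditional warehouse would need to change very little to run it in Hive.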
I would suggest using Hive for large projects where you want to implement SQL-like data access, schemas, metadata, partitions, server-based deployment, JDBC, etc. Pig is a good language and can be very handy for immediate tasks or small projects. I would recommend Pig for small projects.
Hive Hadoop provides users with strong and powerful statistics functions. Hive Hadoop is like SQL, so for any SQL developer the learning curve for Hive will be almost negligible. Hive Hadoop can be integrated with HBase to query the data in HBase directly, whereas Pig cannot; in Pig, a function named HBaseStorage() is used to load the data from HBase. Hive Hadoop has gained popularity because it is supported by Hue. Hive Hadoop has various users such as CNET, Facebook, and Digg.
If you know how to write SQL statements, you can write HiveQL, and it doesn't require you to learn anything new. It's pretty straightforward.
Performance tuning is difficult and becomes harder for complex queries. It still has a few bugs, like all the data going to a single reducer, which can slow down query results.
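One common cause of the "everything lands on one reducer" symptom is key skew: if rows are assigned to reducers by hashing a key, a single very frequent key sends all of its rows to the same reducer. The sketch below is a toy simulation of that effect and of key salting as one possible workaround. It is not Hive's actual partitioner; `zlib.crc32` is used only as a deterministic stand-in hash, and all function names are hypothetical.

```python
import zlib
from collections import Counter


def reducer_for(key: str, num_reducers: int) -> int:
    # Stand-in for a hash partitioner (assumption: crc32 is chosen only
    # because it gives the same result on every run).
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer_loads(keys, num_reducers):
    """Count how many rows each reducer would receive."""
    loads = Counter(reducer_for(k, num_reducers) for k in keys)
    return [loads.get(r, 0) for r in range(num_reducers)]


def salt_keys(keys, num_salts):
    """Append a rotating suffix so one hot key becomes several distinct keys."""
    return [f"{k}#{i % num_salts}" for i, k in enumerate(keys)]


if __name__ == "__main__":
    rows = ["unknown_user"] * 800          # one hot key
    print(reducer_loads(rows, 4))          # all rows on a single reducer
    print(reducer_loads(salt_keys(rows, 8), 4))  # spread over several
```

Salting trades one slow reducer for a second, small aggregation step that merges the partial results per original key, so it only pays off when the skew is severe.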
For developing reports for business analysts: a lot of them know SQL statements, so it's easy to write queries and pull information for analysis.
- Easy to learn
- Can query complex data, including nested structures
- Flexible (with respect to data schema)
- With the ORC SerDe, I/O can be reduced drastically by reading only what is required (columnar formats)
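The I/O saving attributed to columnar formats above can be illustrated with a toy model. This is a sketch of the general idea only, not of ORC's real on-disk layout: it simply counts how many values a scan has to touch when data is stored row by row versus column by column. All names are hypothetical.

```python
def to_columns(rows):
    """Pivot a list of dict rows into one list per column (toy columnar layout)."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def values_read_row_layout(rows):
    # Row-oriented storage passes over every field of every row,
    # even when the query needs only one of them.
    return sum(len(row) for row in rows)


def values_read_column_layout(columns, wanted):
    # Column-oriented storage can skip columns the query never mentions.
    return sum(len(columns[name]) for name in wanted)


if __name__ == "__main__":
    rows = [{"id": i, "country": "DK", "amount": i * 10, "note": "x"}
            for i in range(100)]
    columns = to_columns(rows)
    print(values_read_row_layout(rows))                   # every field
    print(values_read_column_layout(columns, ["amount"])) # one column only
```

A query such as a sum over `amount` touches a quarter of the values in this four-column example; the wider the table, the larger the saving.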
- Needs the schema to be defined up front
- Not ANSI SQL compliant
- Not suitable for fast interactive queries, even on moderately sized datasets
- Works only with Hadoop (not an independent query-processing tool)
- Not enterprise grade with respect to quality of documentation, error messages, and support
Exploring ways to store and process semantic datasets
Hive can tell us the detailed progress of a query, and can incorporate UDFs in different languages
The query speed is way too slow, and it does not support positional arguments in GROUP BY and ORDER BY.
We use Hive to run our nightly workflows on HDFS in batch for data aggregation and analysis.
Hive syntax is almost exactly like SQL, so for someone already familiar with SQL it takes almost no effort to pick up Hive. It can perform a wide variety of analyses over very large sets of data and requires very little tuning if you are willing to wait a while for the results.
Hive can be a bit slow in comparison to other languages like Pig. It also does not have as rich of a scripting language. This is what makes it the second choice language for most data analysis jobs at LinkedIn.
We are trying to mine data from massive data sets for a wide variety of purposes (debugging production issues, creating business metrics, models, and forecasts among other things). We have been able to do this very easily using our data warehouse and a combo of hive and Pig.
Hive is a data warehousing infrastructure built on top of Hadoop to provide data grouping, querying, and analysis. Apache Hive supports the analysis of large datasets stored in Hadoop's HDFS and in compatible systems such as the Amazon S3 file system. It offers a SQL-based query language called HiveQL, with schema on read, and transparently converts queries into MapReduce, Apache Tez, and Spark jobs. All three execution engines can run under YARN. To speed up queries, Hive provides indexes, including bitmap indexes.
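To give a feel for what converting a query into MapReduce means, here is a minimal single-process sketch of how a `SELECT key, COUNT(*) ... GROUP BY key` style query breaks down into map, shuffle, and reduce steps. It is a conceptual illustration under simplifying assumptions, not Hive's actual query planner, and every function name is hypothetical.

```python
from collections import defaultdict


def map_phase(records, key_field):
    """Map: emit a (group key, 1) pair for every input record."""
    for record in records:
        yield record[key_field], 1


def shuffle(pairs):
    """Shuffle: bring all values that share a key together."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def group_by_count(records, key_field):
    """Toy equivalent of a GROUP BY ... COUNT(*) over `key_field`."""
    return reduce_phase(shuffle(map_phase(records, key_field)))


if __name__ == "__main__":
    visits = [{"country": "DK"}, {"country": "SE"}, {"country": "DK"}]
    print(group_by_count(visits, "country"))
```

On a real cluster the map and reduce steps run on many machines and the shuffle moves data over the network, which is a large part of why batch-style queries take a while to return.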
Offers many tools, has great growth potential
Possibility of storing metadata in an organized and easily accessible way.
Hive syntax is almost like SQL, so for someone already familiar with SQL it takes almost no effort to pick up Hive. But there are other tools that can do the same thing faster these days. Hive was initially really good to have, but more and more projects are now available for SQL-like operations on big data (like Drill).
Hive is comparatively slower than its competitors. It's easy to use, but that comes at the cost of processing time. If you are using it just for batch processing, then Hive is fine. It also does not have as rich of a scripting language.
In retail, the business partners are more comfortable querying their own data instead of relying on engineers. Hive solves one of those problems. The main purpose of using Hive is to build reports and analyze data that is stored in the Hadoop file system.
Nothing in particular. It helps us with big data and allows all users to have unrestricted bandwidth, but we already ran into issues with that, so now one of the servers has limitations.
At my company it was fairly troublesome getting access, since the underlying warehousing is in Hadoop and we then have to connect through Hive.
data insights with big browser data through mapreduce
Easy SQL-like syntax for very short and simple queries
No alias for relation. No flow controls as well.
I build machine learning models for an online advertising system. Hive to me is more like an ad hoc query engine than a platform where I can develop complex algorithms.