Baozi Training Leetcode solution 274: H-Index

包子面试培训

发布于 2019-12-12 16:35:03

6650

包子小道消息@12/07/2019

从growth at all cost 到profitability mindset，都是被Uber Wework 弄得。软银老板Masa的头发估计又得脱落几根，不过airbnb也许是2020年被大家追捧的一个新星，从cash at hand / total funds raised 的比例来看，就是下一个google，Facebook和zoom的节奏啊！

Leetcode solution 274: H-Index

Blogger:https://blog.baozitraining.org/2019/12/leetcode-solution-274-h-index.html

Youtube: https://youtu.be/USt5bmYjyZs

博客园: https://www.cnblogs.com/baozitraining/p/11993003.html

B站: https://www.bilibili.com/video/av78199423/

Problem Statement

Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher's h-index.

According to the definition of h-index on Wikipedia: "A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each."

Example:

Input: citations = [3,0,6,1,5]
Output: 3
Explanation: [3,0,6,1,5] means the researcher has 5 papers in total and each of them had
             received 3, 0, 6, 1, 5 citations respectively.
             Since the researcher has 3 papers with at least 3 citations each and the remaining
             two with no more than 3 citations each, her h-index is 3.

Note: If there are several possible values for h, the maximum one is taken as the h-index.

Hint 1

An easy approach is to sort the array first.

Hint 2 What are the possible values of h-index?

Hint 3 A faster approach is to use extra space.

Problem link

Video Tutorial

You can find the detailed video tutorial here

Youtube
B站

Thought Process

It’s not the most straightforward question to understand I have to admit that. I normally start with small examples to help me understand

The requirement is: his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each

What happens if citation contains only 1 element?

Citations = [100] , N = 1, h = 1, N-h = 0

This gives us hint that h has an upper bound of N

What happens if citation contains only 2 elements? And we can assign some large citation values

Citations = [1000, 2000] , N = 2, h = 2. It seems h-index has not too much to do with the citation values itself, rather it’s about the index and the values.

We should now have a super brute force way, since h is bounded to be [0, N], we just go through each value and try to see if it could meet the h criteria (e.g., we start from N, then we see if there is at least N papers that have at least N citations) This would result in a O(N^2) time complexity.

Naturally, we normally can think of if we can sort the array to reduce the time complexity. After sort, using the below graph, the problem becomes find the citation more than the citation array index, and that citation index could be the h-index (that's why it's called h-index, not h-value, it's about index). This would give a O(NlgN) time complexity. Code is attached in the References section.

Finally, we can think about bucket sort since h-index is bounded from [0, N], and if paper citations are larger than N, it's just considered on counting on index N.

You know how to do a bucket sort ?

Solutions

 1 public int hIndex(int[] citations) {
 2     if (citations == null || citations.length == 0) {
 3         return 0;
 4     }
 5 
 6     // because we need from 0 to citations.length, h-index in [0, citations_length]
 7     int[] counts = new int[citations.length + 1];
 8 
 9     for (int i = 0; i < citations.length; i++) {
10         /*
11         if (citations[i] > citations.length) {
12             counts[citations.length] += 1;
13         } else {
14             counts[citations[i]] += Math.min(1, citations[i]);  // this is for 0 citation case
15         }
16         this 4 lines translate into below line, sounds niubility!
17          */
18         counts[Math.min(citations[i], citations.length)]++;
19 
20     }
21 
22     int citationCount = 0;
23     // Utilities.printIntArray(counts);
24     for (int i = counts.length - 1; i >= 0; i--) {
25         citationCount += counts[i];
26         if (i <= citationCount) {
27             return i;
28         }
29     }
30     // there is one and only one h-index, so this would never happen
31     return -1;
32 }

Use bucket sort

Time Complexity: O(N) since we went through citations only once

Space Complexity: O(N) since we are using an extra array to keep track of the count

References
Leetcode official solution (download pdf)

本文参与腾讯云自媒体同步曝光计划，分享自微信公众号。

原始发表：2019-12-08，如有侵权请联系 cloudcommunity@tencent.com 删除

https

网络安全

本文分享自包子铺里聊IT 微信公众号，前往查看

如有侵权，请联系 cloudcommunity@tencent.com 删除。

本文参与腾讯云自媒体同步曝光计划，欢迎热爱写作的你一起参与！

https

网络安全

登录后参与评论

0 条评论

热度

Baozi Training Leetcode solution 274: H-Index