Everyday SQL (10) 10 SQL interview MUST PREPARED topics and answers — Data Scientist SQL面试题最全题库

I found there are lots of resource for SQL interview questions. It can help you on refreshing your knowledge of sql. Go through the below list before you go to an interview which might test your sql skill.


1. 练习题库

题库一 https://leetcode.com/problemset/database/

题库二 https://www.hackerrank.com/domains/sql

If we would not use subquery, we can use temporary table. Temporary table’s performance is better than subquery, also better than WITH statement.

CREATE TEMPORARY TABLE X_record
DISTSTYLE KEY
DISTKEY(area_id)
SORTKEY(end_date)
AS (

);

2.Real interview question:

Make recommendations using the pages that your friends liked. Assume you have two tables: a two-column table (A) of users (user_id) and their friends (friend_id), and a two-column table of users (user_id) and the pages they liked (liked_page) . It should not recommend pages you already like.

This is the same question as below: get all the friends of friends who’s not my friend.

my solution:

select A.user_id, B.liked_page from A

inner join B on A.friend_id=B.user_id

left join B BB

on A.user_id=BB.user_id and B.liked_page=BB.liked_page and BB.user_id is null

Another possible solution:

SELECT A.user_id, B.liked_page
FROM A inner JOIN B ON A.friend_id = B.user_id
EXCEPT
SELECT user_id, liked_page
FROM B

3. What’s the difference between union and union all?

UNION will remove duplicate records, UNION ALL does not.

4. http://blog.bittiger.io/post198/

The self join is very interesting. You can spend a minute to think about it if you did not use it before.

5. JOIN

6. How to write a filter for a string with cat in TITLE column?

where TITLE like ‘%cat%’

7. NVL (expr1, expr2) if expre1 is null, then the targeted value would be expr2.

8. windowing function: rank() over (partition by XXX order by YYY)

lag(which_column_to_write, steps_a_number) over (PARTITION BY XXX ORDER BY XXX ASC) ——-lag is for previous

LEAD is for next

9. sum(CASE WHEN A=B THEN 1 when A>B then 2 else 0 end ) as summed_value

10. Explain the difference between ROW_NUMBER(), RANK(), and DENSE_RANK()

rank can have numbers like this 1,1,1,1,5,5,7 if there are numbers are the same. dense_rank will have 1,1,1,1,2,2,3 for the same data. Row_number would have 1,2,3,4,5,6,7 for the same data.

Leave a Comment