in Databases recategorized by
12,640 views
43 votes
43 votes

Consider a selection of the form $\sigma_{A\leq 100} (r)$, where $r$ is a relation with $1000$ tuples. Assume that the attribute values for $A$ among the tuples are uniformly distributed in the interval $[0, 500].$ Which one of the following options is the best estimate of the number of tuples returned by the given selection query ?

  1. $50$
  2. $100$
  3. $150$
  4. $200$
in Databases recategorized by
12.6k views

4 Comments

"Relational Algebra doesn't allow repetition"

Well, it's selection query, not Projection. Selection by default will never have duplicates, since all tuples have to be different.
0
0
If whole tuple will repeat then it will ignore repeating tuple. But Here may be more attributes apart from A...And here A is repeating but combination of other attribute with A may not repeat...bcz of this 200 is ans instead of 100.

 

If projection(A) has also given with this question then No doubt 100 would be answer.
0
0
edited by
Another way to solve this question

Probability density function $F(x) \  =  \frac{1}{500\  – \ 0} = \frac{1}{500}$

$\therefore P(x \le 100) = \int_{0}^{100} F(x) dx$

$P = \int_{0}^{100} \frac{1}{500} dx$

$P = \frac{1}{500}  \ [x]_{0}^{100}$

$P = \frac{100}{500} = \frac{1}{5}$

So, the total number of tuples $= NP$

$= 1000 \  \times  \ \frac{1}{5}$

$= 200$
11
11

5 Answers

45 votes
45 votes
Best answer
$\sigma_{A \leq 100}(r)$
 $r$ has $1000$ tuples

Values for A among the tuples are uniformly distributed in the interval $[0, 500].$ This can be split to $5$ mutually exclusive (non-overlapping) and exhaustive (no other intervals) intervals of same width of $100$ $([0-100], [101-200], [201-300], [301-400], [401-500],$ $0$ makes the first interval larger - this must be a typo in question) and we can assume all of them have same number of values due to Uniform distribution. So, number of tuples with A value in first interval should be

$\frac{\text{Total no. of tuples}}{5} = 1000/5 = 200$

Correct Answer: $D$
edited by

22 Comments

each tupple can choose number from 0 to 500.. So the probability of choosing a number from 0 to 100 for each tupple is 101 / 501..

So number of tupple return is = expected number of tupple choose number from 0 to 100

= 101/501 + 101/501..... 1000 times

around 200 tupples..

(D)
0
0
Numbers between 0 - 500 so number less than equal to 100 is 101(0- 100) . so answer should be 202.

why we take 1- 500 since closed bracket used so 0 and 500 are inclusive.
1
1
yes, but that must be a typo. And if we include 0, answer must be 101/501*1000. Also we cant assume values are integers.
6
6
Ok sir :)
0
0
how did you spilt [0-500]? why only 5 intervals ? why cant we have more intervals? didn't get it at all. please help!
0
0
@Arjun Sir selection of A <= 100 means only 100 tuples will be returned right? Because selection operator removes duplicates??
2
2

@ Sasidhar Gelivi  

I think you are confusing it with projection(\Pi).

1
1
yes that person is confusing
–2
–2
What's the meaning of typo  question ????
0
0
is RA allow duplicate ??

if not then how 200 will be the  ans ??
1
1
how did you spilt [0-500]? why only 5 intervals ? why cant we have more intervals?

didn't get it at all. please help!
0
0
since in question given...we have only nos. 0 to 500 to distribute among 1000 tuples for attribute A "UNIFORMLY". So if we use each element from 0 to 500 twice, we will have 1002 element which is suffice for distributing among 1000 tuples. we will use any of the 499 elements only and two times each. so, in this way we'll have 1000 values for A of relation r, distributed uniformly on 0 to 500.

so now when we'll select all the A values such that a<=100, we will get min 200(if 0 to 499 selected) and max 202(if 1 to 500 selected) such tuples.

so both the answers are correct, better go with the options now, whichever is given select it. if both given the select 202. bcz its the max possibility.
0
0

@ashwina did u get how split is made... i am also not getting this?

1
1

we've numbers [0,500] & tuples 1000. Now we've to uniformly or equally distribute this 501 numbers into 1000 tuples.

i.e each number will get, 1000/501 = 1.996 tuples.

i.e each number will get 2 tuples.( here floating point comes because of typo,if the number range would have [1,500] then we'll get perfectly 2 tuples for each number).

So the RA expression will select 101*2 = 202 tuples technically.

But here answer should be taken as 200 because of typo.

10
10
but tell me this

but in relational algebra, in the final table ,the tuples are distinct,so if each number is repeated two times....even through number of distinct tuples should be 100
1
1
edited by

It would be if it was $\pi_{A\leq 100} (r)$

but here it's selecting tuple,not a single attribute.

4
4

What if we choose interval like [0-50],[51-100].....[950-1000] the we would be having 10 intervals of size 50 each and each interval like others can contain a attribute value twice, then in this case we get vatue as 100 instead of 200 As

Total number of tuple /10 =1000/10 =100

@Abhinav Rana Sir

 

 

0
0

@MRINMOY_HALDER

Can u tell me one thing, if we can accommodate <=100 values , then why do we need [0,500] interval to accommodate it or vice versa??

 

 

0
0

why do we need [0,500] interval to accommodate it or vice versa

what is the meaning of this line ???

here A attribute can take value in the range (0 - 500), &  these 0 - 500 values will be distributed in 1000 places,that too uniformly(equal frequency of each values).

0
0
Then what is meaning of <=100??
0
0

totaltotal we have 1000

 Now we splitting [0-500]---> 0-250, 251-500

Combinely we have 0-250,251-500,0-1000 in the 200 tuple is <=100

0
0

Since we have 1000 tuples and values between 0 and 500.So we can distribute 0-499 initially.So by that time 500 tuples would have been completed.

Now the next value is 500 which is 501 st  tuple.Now if we try to wrap around and repeat same interval again,since we have 499 tuples remained with values possible from 0 to 499.

So for values A < 100 we can have initial 101 + final 101 =202 tuples in total.

So the best estimate matches 200 tuples?

Can we consider such a distribution?

0
0
27 votes
27 votes

option D

total numbers are 1000 and they have said that it is uniformly distrubuted between [0,500] it means every number is 2 times thats the only way we can distribute it uniformaly and as per our condition A<=100 at max 100 tuples can be there and and every one can be repeated 2 times so it sums up to 200 hence it is answer

4 Comments

@adhiti

Here it is not mentioned which attributes we are projecting, so even if the value of A may be same the other attribute's value may be different making the tuples different, hence 200 is the correct answer

0
0
does select operator removes redundant tuples?
0
0

@Jhaiyam A relation is a set and so it has no duplicates. Here, select with respect to relational algebra, it does eliminate duplicates.

1
1
3 votes
3 votes

may be this is best and simple method to understand this question

1 vote
1 vote
There must be typo in question. As clearly 1000 tuples written in question. And values of A are uniformly distributed.

By taking (0 500] , A values can be 0.5, 1,1.5.... 100, 100.5....499, 499.5, 500

Total 1000 values.

By which values <=100 are 200.
Answer:

Related questions