Branch data Line data Source code
1 : : // Copyright (C) 2011 Internet Systems Consortium, Inc. ("ISC")
2 : : //
3 : : // Permission to use, copy, modify, and/or distribute this software for any
4 : : // purpose with or without fee is hereby granted, provided that the above
5 : : // copyright notice and this permission notice appear in all copies.
6 : : //
7 : : // THE SOFTWARE IS PROVIDED "AS IS" AND ISC DISCLAIMS ALL WARRANTIES WITH
8 : : // REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
9 : : // AND FITNESS. IN NO EVENT SHALL ISC BE LIABLE FOR ANY SPECIAL, DIRECT,
10 : : // INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
11 : : // LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
12 : : // OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
13 : : // PERFORMANCE OF THIS SOFTWARE.
14 : :
15 : : // $Id$
16 : :
17 : : #ifndef __RESPONSE_SCRUBBER_H
18 : : #define __RESPONSE_SCRUBBER_H
19 : :
20 : : /// \page DataScrubbing Data Scrubbing
21 : : /// \section DataScrubbingIntro Introduction
22 : : /// When a response is received from an authoritative server, it should be
23 : : /// checked to ensure that the data contained in it is valid. Signed data is
24 : : /// not a problem - validating the signatures is a sufficient check. But
25 : : /// unsigned data in a response is more of a problem. (Note that even data from
26 : : /// signed zones may not be signed, e.g. delegations are not signed.) In
27 : : /// particular, how do we know that the server from which the response was
28 : : /// received was authoritative for the data it returned?
29 : : ///
30 : : /// The part of the code that checks for this is the "Data Scrubbing" module.
31 : : /// Although it includes the checking of IP addresses and ports, it is called
32 : : /// "Scrubbing" because it "scrubs" the returned message and removes doubtful
33 : : /// information.
34 : : ///
35 : : /// \section DataScrubbingBasic Basic Checks
36 : : /// The first part - how do we know that the response comes from the correct
37 : : /// server - is relatively trivial, albeit not foolproof (which is why DNSSEC
38 : : /// was developed). The following are checked:
39 : : ///
40 : : /// - The IP address from which the response was received is the same as the
41 : : /// one to which the query was sent.
42 : : /// - The port on which the response was received is the same as the one from
43 : : /// which the query was sent.
44 : : ///
45 : : /// (These tests need not be done for a TCP connection - if data is received
46 : : /// over the TCP stream, it is assumed that it comes from the address and port
47 : : /// to which a connection was made.)
48 : : ///
49 : : /// - The protocol used to send the question is the same as the protocol on
50 : : /// which an answer was received.
51 : : ///
52 : : /// (Strictly speaking, if this check fails it is a programming error - the
53 : : /// code should not mix up UDP and TCP messages.)
54 : : ///
55 : : /// - The QID in the response message is the same as the QID in the query
56 : : /// message sent.
57 : : ///
58 : : /// If the conditions are met, then the data - in all three response sections -
59 : : /// is scanned and out of bailiwick data is removed ("scrubbed").
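The basic checks listed above can be sketched in a few lines. This is only an illustration under stated assumptions: `Transfer`, `Protocol`, `CheckResult`, `basicEndpointCheck` and `basicQidCheck` are hypothetical names, not the asiolink or dns library types used by the class below, and the order in which the checks run is a choice made for the sketch.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-ins for illustration; not the asiolink/dns types.
enum class Protocol { UDP, TCP };
enum class CheckResult { SUCCESS, ADDRESS, PORT, PROTOCOL };

// One side of an exchange. For a query: the address it was sent to and the
// local port it was sent from. For a response: the address it was received
// from and the local port it was received on.
struct Transfer {
    std::string peer_address;
    std::uint16_t local_port;
    Protocol protocol;
};

// Compare the query's view of the exchange with the response's view.
CheckResult basicEndpointCheck(const Transfer& query, const Transfer& response) {
    if (query.protocol != response.protocol) {
        return CheckResult::PROTOCOL;  // UDP/TCP mix-up: a programming error
    }
    if (query.protocol == Protocol::TCP) {
        // Address and port are implied by the established connection.
        return CheckResult::SUCCESS;
    }
    if (query.peer_address != response.peer_address) {
        return CheckResult::ADDRESS;
    }
    if (query.local_port != response.local_port) {
        return CheckResult::PORT;
    }
    return CheckResult::SUCCESS;
}

// The QID of the response must equal the QID of the query.
bool basicQidCheck(std::uint16_t sent_qid, std::uint16_t received_qid) {
    return sent_qid == received_qid;
}
```

Only when both functions report success would the message go on to be scrubbed for out of bailiwick data.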
60 : : ///
61 : : /// \section DataScrubbingBailiwick Bailiwick
62 : : /// Bailiwick means "district or jurisdiction of bailie or bailiff" (Concise
63 : : /// Oxford Dictionary, 7th Edition). It is not a term mentioned in any RFC
64 : : /// (or at least, any RFC up to RFC 5997) but is widely used in DNS literature.
65 : : /// In this context it is taken to mean the data for which a DNS server has
66 : : /// authority. So when we speak of the information being "in bailiwick", we
67 : : /// mean that the server is the ultimate source of authority for that data.
68 : : ///
69 : : /// In practice, determining this from the response alone is difficult. In
70 : : /// particular, as a server may be authoritative for many zones, it could in
71 : : /// theory be authoritative for any combination of RRsets that appear in a
72 : : /// response.
73 : : ///
74 : : /// For this reason, bailiwick is dependent on the query. If, for example, a
75 : : /// query for www.example.com is sent to the nameservers for example.com
76 : : /// (because of a referral from the com. servers), the bailiwick for the
77 : : /// query is example.com. This means that any information returned on domains
78 : : /// other than example.com may not be authoritative. More exactly, it may be
79 : : /// authoritative (because the server is also authoritative for the zone
80 : : /// concerned), but based on the information available (in this example, that
81 : : /// the response originated from a nameserver for the zone example.com) it is
82 : : /// not possible to be certain.
83 : : ///
84 : : /// Ideally, out of bailiwick data should be excluded from further processing
85 : : /// as it may be incorrect and corrupt the cache. In practice, there are
86 : : /// two cases to consider:
87 : : ///
88 : : /// The first is when the data has a qname that is not example.com or a
89 : : /// subdomain of it (e.g. xyz.com, www.example.net). In this case the data can
90 : : /// be retrieved by an independent query - no path from the root zone to the
91 : : /// data goes through the current bailiwick, so there is no chance of ending up
92 : : /// in a loop. In this case, data that appears to be out of bailiwick can be
93 : : /// dropped from the response.
94 : : ///
95 : : /// The second case is when the QNAME of the data is a subdomain of the
96 : : /// bailiwick. Here the server may or may not be authoritative for the data.
97 : : /// For example, if the name queried for were www.sub.example.com and the
98 : : /// example.com nameservers supplied an answer:
99 : : ///
100 : : /// - The answer could be authoritative - www.sub.example.com could be
101 : : /// in the example.com zone.
102 : : /// - The answer might not be authoritative - the zone sub.example.com may have
103 : : /// been delegated, so the authoritative answer should come from
104 : : /// sub.example.com's nameservers.
105 : : /// - The answer might be authoritative even though zone sub.example.com has
106 : : /// been delegated, because the nameserver for example.com is the same as
107 : : /// that for sub.example.com.
108 : : ///
109 : : /// Unlike the previous case, it is not possible to err on the side of caution
110 : : /// and drop such data. Any independent query for it will pass through the
111 : : /// current bailiwick and the same question will be asked again. For this
112 : : /// reason, any data in the response that has a QNAME equal to a subdomain of
113 : : /// the bailiwick has to be accepted.
114 : : ///
115 : : /// In summary then, data in a response that has a QNAME equal to or a subdomain
116 : : /// of the bailiwick is considered in-bailiwick. Anything else is out of
117 : : /// bailiwick.
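The summary rule can be illustrated with a small label-wise comparison. This is a minimal sketch working on plain dotted strings, not the isc::dns::Name comparison used by the class below; `splitLabels` and `inBailiwick` are hypothetical helper names, and escaped dots inside labels are not handled.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Split a dotted name into lower-cased labels, ignoring a trailing dot.
// (Simplification: escaped dots within a label are not supported.)
std::vector<std::string> splitLabels(const std::string& name) {
    std::vector<std::string> labels;
    std::string current;
    for (char c : name) {
        if (c == '.') {
            if (!current.empty()) {
                labels.push_back(current);
                current.clear();
            }
        } else {
            current += static_cast<char>(
                std::tolower(static_cast<unsigned char>(c)));
        }
    }
    if (!current.empty()) {
        labels.push_back(current);
    }
    return labels;
}

// True if "name" is equal to "zone" or is a subdomain of it.
bool inBailiwick(const std::string& name, const std::string& zone) {
    const std::vector<std::string> n = splitLabels(name);
    const std::vector<std::string> z = splitLabels(zone);
    if (z.size() > n.size()) {
        return false;
    }
    // Compare right to left: the zone must be a label-wise suffix of the name.
    return std::equal(z.rbegin(), z.rend(), n.rbegin());
}
```

Comparing whole labels rather than raw characters matters: a name such as badexample.com ends with the characters "example.com" but is not a subdomain of example.com, and the sketch rejects it.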
118 : : ///
119 : : /// \subsection DataScrubbingCrossSection Cross-Section Scrubbing
120 : : /// Even with the bailiwick checks above, there is some additional cleaning
121 : : /// that can be done with the packet. In particular:
122 : : ///
123 : : /// - The QNAMEs of the RRsets in the authority section must be equal to or
124 : : /// superdomains of a QNAME of an RRset in the answer. Any that are not
125 : : /// should be removed.
126 : : /// - If there is no answer section, the QNAMES of RRsets in the authority
127 : : /// section must be equal to or superdomains of the QNAME of the RRset in the
128 : : /// question.
129 : : ///
130 : : /// Although previous checks should have removed some inconsistencies, they
131 : : /// will not trap obscure cases (e.g. bailiwick: "example.com", answer:
132 : : /// "www.example.com", authority: "sub.example.com"). These checks do just that.
133 : : ///
134 : : /// (Note that one check not included here is that the QNAME of the question
135 : : /// is equal to or a superdomain of the answer; that check is made in the
136 : : /// ResponseClassifier class.)
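The cross-section rule can be sketched as a filter over the owner names of the authority section. This is an illustration only: `equalOrSuperdomain` and `scrubAuthority` are hypothetical names, and the sketch assumes names are given as lower-case dotted strings without a trailing dot, rather than as isc::dns::Name objects.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// True if "upper" is equal to "lower" or is a superdomain of it.
// Assumes lower-case names written without a trailing dot.
bool equalOrSuperdomain(const std::string& upper, const std::string& lower) {
    if (upper == lower) {
        return true;
    }
    const std::string suffix = "." + upper;
    return lower.size() > suffix.size() &&
           lower.compare(lower.size() - suffix.size(), suffix.size(),
                         suffix) == 0;
}

// Remove authority owner names that are not equal to or a superdomain of at
// least one reference name. Returns the number of names removed.
std::size_t scrubAuthority(std::vector<std::string>& authority,
                           const std::vector<std::string>& answers,
                           const std::string& question) {
    // With no answer section, the question name is the reference.
    std::vector<std::string> reference = answers;
    if (reference.empty()) {
        reference.push_back(question);
    }

    const std::size_t before = authority.size();
    authority.erase(
        std::remove_if(authority.begin(), authority.end(),
            [&reference](const std::string& owner) {
                return std::none_of(reference.begin(), reference.end(),
                    [&owner](const std::string& name) {
                        return equalOrSuperdomain(owner, name);
                    });
            }),
        authority.end());
    return before - authority.size();
}
```

For the obscure case quoted above (answer "www.example.com", authority "sub.example.com"), the authority name is neither equal to nor a superdomain of the answer name, so the sketch removes it.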
137 : : ///
138 : : /// \section DataScrubbingExample Examples
139 : : /// Some examples should make this clear: they all use the notation
140 : : /// Qu = Question, Zo = Zone being queried, An = Answer, Au = Authority,
141 : : /// Ad = Additional.
142 : : ///
143 : : /// \subsection DataScrubbingEx1 Example 1: Simple Query
144 : : /// Querying a nameserver for the zone "example.com" for www.example.com and
145 : : /// receiving the answer "www.example.com A 192.0.2.1" with two nameservers quoted
146 : : /// as authority and both their addresses in the additional section:
147 : : ///
148 : : /// Qu: www.example.com\n
149 : : /// Zo: example.com
150 : : ///
151 : : /// An: www.example.com A 192.0.2.1
152 : : ///
153 : : /// Au(1): example.com NS ns0.example.com\n
154 : : /// Au(2): example.com NS ns1.example.net
155 : : ///
156 : : /// Ad(1): ns0.example.com A 192.0.2.100\n
157 : : /// Ad(2): ns1.example.net A 192.0.2.200
158 : : ///
159 : : /// This answer could be returned by a properly configured server. All resource
160 : : /// records in the response - with the exception of Ad(2) - are in bailiwick
161 : : /// because the QNAME is equal to or a subdomain of the zone being queried.
162 : : ///
163 : : /// It is permissible for Ad(2) to be returned by a properly configured server
164 : : /// as a hint to resolvers. However the example.com nameservers are not
165 : : /// authoritative for addresses of domains in example.net; that record could
166 : : /// be out of date or incorrect. Indeed, it might even be a deliberate attempt
167 : : /// at a spoof by getting us to cache an invalid address for ns1.example.net.
168 : : /// The safest thing to do is to drop the A record and to get the address of
169 : : /// ns1.example.net by querying for that name through the .net nameservers.
170 : : ///
171 : : /// \subsection DataScrubbingEx2 Example 2: Multiple Zones on Same Nameserver
172 : : /// Assume now that example.com and sub.example.com are hosted on the same
173 : : /// nameserver and that from the .com zone the resolver has received a referral
174 : : /// to example.com. Suppose that the query is for www.sub.example.com and that
175 : : /// the following response is received:
176 : : ///
177 : : /// Qu: www.sub.example.com\n
178 : : /// Zo: example.com
179 : : ///
180 : : /// An: (nothing)
181 : : ///
182 : : /// Au(1): sub.example.com NS ns0.sub.example.com\n
183 : : /// Au(2): sub.example.com NS ns1.example.net
184 : : ///
185 : : /// Ad(1): ns0.sub.example.com A 192.0.2.101\n
186 : : /// Ad(2): ns1.example.net A 192.0.2.201
187 : : ///
188 : : /// Although we asked the example.com nameservers for information, we got the
189 : : /// nameservers for sub.example.com in the authority section. This is valid
190 : : /// because if BIND-10 hosts multiple zones, it will look up the data in the
191 : : /// zone that most closely matches the query.
192 : : ///
193 : : /// Using the criteria above, the data in the additional section can therefore
194 : : /// be regarded as in bailiwick because sub.example.com is a subdomain of
195 : : /// example.com. As before though, the address for ns1.example.net in the
196 : : /// additional section is not in bailiwick because ns1.example.net is not a
197 : : /// subdomain of example.com.
198 : : ///
199 : : /// \subsection DataScrubbingEx3 Example 3: Deliberate Spoof Attempt
200 : : /// Qu: www.example.com\n
201 : : /// Zo: example.com
202 : : ///
203 : : /// An: www.example.com A 192.0.2.1
204 : : ///
205 : : /// Au(1): com NS ns0.example.com\n
206 : : /// Au(2): com NS ns1.example.net
207 : : ///
208 : : /// Ad(1): ns0.example.com A 192.0.2.100\n
209 : : /// Ad(2): ns1.example.net A 192.0.2.200
210 : : ///
211 : : /// This is a deliberately invalid response. The query is being sent to the
212 : : /// nameservers for example.com (presumably because a referral to example.com
213 : : /// was received from the com nameservers), but the response is an attempt
214 : : /// to get the specified nameservers cached as the nameservers for com - for
215 : : /// which example.com is not authoritative.
216 : : ///
217 : : /// Note though that this response is only invalid because, due to the previous
218 : : /// referral, the query was sent to the example.com nameservers. Had the
219 : : /// referral been to the com nameservers, it would be a valid response; the com
220 : : /// zone could well be serving all the data for example.com. Having said that,
221 : : /// the A record for ns1.example.net would still be regarded as being out of
222 : : /// bailiwick because the nameserver is not authoritative for the .net zone.
223 : : ///
224 : : /// \subsection DataScrubbingEx4 Example 4: Inconsistent Answer Section
225 : : /// Qu: www.example.com\n
226 : : /// Zo: example.com
227 : : ///
228 : : /// An: www.example.com A 192.0.2.1
229 : : ///
230 : : /// Au(1): alpha.example.com NS ns0.example.com\n
231 : : /// Au(2): alpha.example.com NS ns1.example.net
232 : : ///
233 : : /// Ad(1): ns0.example.com A 192.0.2.100\n
234 : : /// Ad(2): ns1.example.net A 192.0.2.200
235 : : ///
236 : : /// Here, everything in the answer and authority sections is in bailiwick for
237 : : /// the example.com server. And although the zone example.com was queried, it
238 : : /// is permissible for the authority section to contain nameservers with a
239 : : /// qname that is a subdomain of example.com (e.g. see \ref DataScrubbingEx2).
240 : : /// However, only servers with a qname that is equal to or a superdomain of
241 : : /// the answer are authoritative for the answer. So in this case, both
242 : : /// Au(1) and Au(2) (as well as Ad(2), for reasons given earlier) will be
243 : : /// scrubbed.
244 : :
245 : : #include <config.h>
246 : : #include <asiolink/io_endpoint.h>
247 : : #include <dns/message.h>
248 : : #include <dns/name.h>
249 : :
250 : : /// \brief Response Data Scrubbing
251 : : ///
252 : : /// This is the class that implements the data scrubbing. Given a response
253 : : /// message and some additional information, it checks the information using
254 : : /// the rules given in \ref DataScrubbing and either rejects the packet or
255 : : /// modifies it to remove non-conforming RRsets.
256 : : ///
257 : : /// TODO: Examine the additional records and remove all cases where the
258 : : /// QNAME does not match the RDATA of records in the authority section.
259 : :
260 : : class ResponseScrubber {
261 : : public:
262 : :
263 : : /// \brief Response Code for Address Check
264 : : enum Category {
265 : : SUCCESS = 0, ///< Packet is OK
266 : :
267 : : // Error categories
268 : :
269 : : ADDRESS = 1, ///< Mismatching IP address
270 : : PORT = 2, ///< Mismatching port
271 : : PROTOCOL = 3 ///< Mismatching protocol
272 : : };
273 : :
274 : : /// \brief Check IP Address
275 : : ///
276 : : /// Compares the address to which the query was sent, the port it was
277 : : /// sent from, and the protocol used for communication with the (address,
278 : : /// port, protocol) from which the response was received.
279 : : ///
280 : : /// \param to Endpoint representing the address to which the query was sent.
281 : : /// \param from Endpoint from which the response was received.
282 : : ///
283 : : /// \return SUCCESS if the two endpoints match, otherwise an error status
284 : : /// indicating what was incorrect.
285 : : static Category addressCheck(const isc::asiolink::IOEndpoint& to,
286 : : const isc::asiolink::IOEndpoint& from);
287 : :
288 : : /// \brief Check QID
289 : : ///
290 : : /// Compares the QID in the sent message with the QID in the response.
291 : : ///
292 : : /// \param sent Message sent to the authoritative server
293 : : /// \param received Message received from the authoritative server
294 : : ///
295 : : /// \return true if the QIDs match, false otherwise.
296 : : static bool qidCheck(const isc::dns::Message& sent,
297 : : const isc::dns::Message& received) {
298 [ + - ][ + - ]: 2 : return (sent.getQid() == received.getQid());
[ + - ][ + - ]
299 : : }
300 : :
301 : : /// \brief Generalised Scrub Message Section
302 : : ///
303 : : /// When scrubbing a message given the bailiwick of the server, RRsets are
304 : : /// retained in the message section if the QNAME is equal to or a subdomain
305 : : /// of the bailiwick. However, when checking QNAME of RRsets in the
306 : : /// authority section against the QNAME of the question or answers, RRsets
307 : : /// are retained only if their QNAME is equal to or a superdomain of the
308 : : /// name in question.
309 : : ///
310 : : /// This method provides the generalised scrubbing whereby the RRsets in
311 : : /// a section are tested against a given name, and RRsets kept if their
312 : : /// QNAME is equal to or in the supplied relationship with the given name.
313 : : ///
314 : : /// \param section Section of the message to be scrubbed.
315 : : /// \param names Names against which RRsets should be checked. Note that
316 : : /// this is a vector of pointers to Name objects; they are assumed to
317 : : /// independently exist, and the caller retains ownership of them and is
318 : : /// assumed to destroy them when needed.
319 : : /// \param connection Relationship required for retention, i.e. the QNAME of
320 : : /// an RRset in the specified section must be equal to or a "connection"
321 : : /// (SUPERDOMAIN/SUBDOMAIN) of a name in "names" for the RRset to be retained.
322 : : /// \param message Message to be scrubbed.
323 : : ///
324 : : /// \return Count of the number of RRsets removed from the section.
325 : : static unsigned int scrubSection(isc::dns::Message& message,
326 : : const std::vector<const isc::dns::Name*>& names,
327 : : const isc::dns::NameComparisonResult::NameRelation connection,
328 : : const isc::dns::Message::Section section);
329 : :
330 : : /// \brief Scrub All Sections of a Message
331 : : ///
332 : : /// Scrubs each of the answer, authority and additional sections of the
333 : : /// message.
334 : : ///
335 : : /// No distinction is made between RRsets legitimately in the message (e.g.
336 : : /// glue for authorities that are not in bailiwick) and ones that could be
337 : : /// considered as attempts at spoofing (e.g. non-bailiwick RRsets in the
338 : : /// additional section that are not related to the query).
339 : : ///
340 : : /// The resultant packet returned to the caller may be invalid. If so, it
341 : : /// is up to the caller to detect that.
342 : : ///
343 : : /// \param message Message to be scrubbed.
344 : : /// \param bailiwick Name of the zone whose authoritative servers were
345 : : /// queried.
346 : : ///
347 : : /// \return Count of the number of RRsets removed from the message.
348 : : static unsigned int scrubAllSections(isc::dns::Message& message,
349 : : const isc::dns::Name& bailiwick);
350 : :
351 : : /// \brief Scrub Across Message Sections
352 : : ///
353 : : /// Does some cross-section comparisons and removes inconsistent RRs. In
354 : : /// particular it:
355 : : ///
356 : : /// - If an answer is present, checks that the qnames of the authority RRs
357 : : /// are equal to or superdomains of the qname of an answer RRset. Any that
358 : : /// are not are removed.
359 : : /// - If an answer is not present, checks that the authority RRs are
360 : : /// equal to or superdomains of the question. If not, the authority RRs
361 : : /// are removed.
362 : : ///
363 : : /// Note that the scrubbing does not check:
364 : : ///
365 : : /// - that the question is in the bailiwick of the server; that check is
366 : : /// assumed to have been done prior to the query being sent (else why
367 : : /// was the query sent there in the first place?)
368 : : /// - that the qname of one of the RRsets in the answer (if present) is
369 : : /// equal to the qname of the question (that check is done in the
370 : : /// response classification code).
371 : : ///
372 : : /// \param message Message to be scrubbed.
373 : : ///
374 : : /// \return Count of the number of RRsets removed from the section.
375 : : static unsigned int scrubCrossSections(isc::dns::Message& message);
376 : :
377 : : /// \brief Main Scrubbing Entry Point
378 : : ///
379 : : /// The single entry point to the module to sanitise the message. All
380 : : /// it does is call the various other scrubbing methods.
381 : : ///
382 : : /// \param message Pointer to the message to be scrubbed. (This is a
383 : : /// pointer - as opposed to a Message as in other methods in this class -
384 : : /// as the external code is expected to be mainly using message pointers
385 : : /// to access messages.)
386 : : /// \param bailiwick Name of the zone whose authoritative servers were
387 : : /// queried.
388 : : ///
389 : : /// \return Count of the number of RRsets removed from the message.
390 : : static unsigned int scrub(const isc::dns::MessagePtr& message,
391 : : const isc::dns::Name& bailiwick);
392 : :
393 : : /// \brief Comparison Function for Sorting Name Pointers
394 : : ///
395 : : /// Utility method called to sort pointers to names in lexical order.
396 : : ///
397 : : /// \param n1 Pointer to first Name object
398 : : /// \param n2 Pointer to second Name object
399 : : ///
400 : : /// \return true if n1 is less than n2, false otherwise.
401 : 0 : static bool compareNameLt(const isc::dns::Name* n1,
402 : : const isc::dns::Name* n2)
403 : : {
404 : 0 : return (*n1 < *n2);
405 : : }
406 : :
407 : : /// \brief Function for Comparing Name Pointers
408 : : ///
409 : : /// Utility method called to compare pointers to names for equality.
410 : : ///
411 : : /// \param n1 Pointer to first Name object
412 : : /// \param n2 Pointer to second Name object
413 : : ///
414 : : /// \return true if n1 is equal to n2, false otherwise.
415 : 0 : static bool compareNameEq(const isc::dns::Name* n1,
416 : : const isc::dns::Name* n2)
417 : : {
418 : 0 : return (*n1 == *n2);
419 : : }
420 : : };
421 : :
422 : : #endif // __RESPONSE_SCRUBBER_H