Bug 90885 - GCC should warn about 2^16 and 2^32 and 2^64 [-Wxor-used-as-pow]
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: c
Version: 10.0
Importance: P3 enhancement
Target Milestone: 13.0
Assignee: David Malcolm
URL:
Keywords: diagnostic, patch
Depends on:
Blocks: new-warning, new_warning
Reported: 2019-06-14 08:06 UTC by Jonathan Wakely
Modified: 2023-04-27 15:39 UTC
CC List: 9 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2019-06-14 00:00:00


Attachments
v1 of a patch to implement -Wxor-used-as-pow (4.45 KB, patch)
2022-08-11 14:29 UTC, David Malcolm

Description Jonathan Wakely 2019-06-14 08:06:11 UTC
GCC doesn't warn for:

short maxshort = 2^16;
int maxint = 2^32;
long maxlong = 2^64;

Sadly people actually do this:
https://twitter.com/jfbastien/status/1139298419988549632
https://twitter.com/mikemx7f/status/1139335901790625793

We probably want it for both C and C++.
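
To make the arithmetic concrete, here is a minimal C sketch of what the three initializers above actually store, next to what was presumably intended. The function names are illustrative only; `xor_of` takes its operands as arguments so that the example itself does not contain the pattern under discussion.

```c
#include <limits.h>

/* Bitwise XOR, wrapped in a function so the operands are not literals. */
int xor_of(int a, int b) { return a ^ b; }

/* The values the three initializers in the report actually compute. */
int actual_maxshort(void) { return xor_of(2, 16); }  /* 0b00010 ^ 0b10000 */
int actual_maxint(void)   { return xor_of(2, 32); }
int actual_maxlong(void)  { return xor_of(2, 64); }

/* Presumed intent (an assumption): the type limits from <limits.h>. */
short intended_maxshort(void) { return SHRT_MAX; }
int   intended_maxint(void)   { return INT_MAX; }
long  intended_maxlong(void)  { return LONG_MAX; }
```

The three `actual_*` functions return 18, 34 and 66 respectively, nowhere near any type limit.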
Comment 1 Jonathan Wakely 2019-06-14 08:18:48 UTC
And maybe also 10^X where X is a literal.
https://codesearch.isocpp.org/cgi-bin/cgi_ppsearch?q=10+%5E&search=Search

A sample:

        tp->tv_sec  = attributes[0] / 10^9;
        tp->tv_nsec = attributes[0] % 10^9;
    
     range->pmin[TREND_FCT] = -10^10;
     range->pmax[TREND_MEAN] = 10^10;


    #define IRQ_CHINT5_LEVEL        (12 ^ 7)
    #define IRQ_CHINT4_LEVEL        (11 ^ 7)
    #define IRQ_CHINT3_LEVEL        (10 ^ 7)
    #define IRQ_CHINT2_LEVEL        (9 ^ 7)
    #define IRQ_CHINT1_LEVEL        (8 ^ 7)
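
The `tv_sec` and `-10^10` lines are wrong twice over: `^` is XOR, and it binds more loosely than `/`, `%` and unary minus. A small C sketch of how they parse (function names are made up for illustration; `v` stands in for `attributes[0]`, and the nanoseconds-to-seconds intent is an assumption):

```c
/* The expression exactly as it appears in the sample. */
long long div_as_written(long long v) { return v / 10 ^ 9; }

/* How C parses it: '/' binds tighter than '^'. */
long long div_as_parsed(long long v) { return (v / 10) ^ 9; }

/* What was presumably meant: divide by ten to the ninth. */
long long div_intended(long long v) { return v / 1000000000LL; }

/* -10^10 parses as (-10) ^ 10; a variable is used here instead of literals.
   The result shown in the usage note assumes two's complement integers. */
int neg_ten_xor_ten(void)
{
    int ten = 10;
    return -ten ^ ten;
}
```

For `v = 5000000000`, `div_as_written` and `div_as_parsed` both give 500000009 while `div_intended` gives 5, and `neg_ten_xor_ten()` gives -4 rather than minus ten billion.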
Comment 2 Jonathan Wakely 2019-06-14 09:41:06 UTC
https://github.com/google/gvisor/pull/375
Comment 3 Jonathan Wakely 2019-06-14 11:06:05 UTC
	read_bytes(&f, (char *) &(val),
		   ( (n < (2 ^ 8))  ? 1 :
		     ( (n < (2 ^ 16)) ? 2 :
		       ( (n < (2 ^ 24)) ? 3 :
			 4 ) ) ) );
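
For reference, the thresholds in that snippet evaluate to 10, 18 and 26, so almost every value of `n` ends up with a length of 4. A hedged C sketch comparing the expression as written with the presumably intended shift-based version (both function names and the `xor_of` helper are hypothetical):

```c
/* XOR helper; operands are passed as arguments rather than written as literals. */
static unsigned long xor_of(unsigned long a, unsigned long b) { return a ^ b; }

/* Byte count selected by the original conditional expression. */
int bytes_as_written(unsigned long n)
{
    return n < xor_of(2, 8)  ? 1 :   /* n < 10 */
           n < xor_of(2, 16) ? 2 :   /* n < 18 */
           n < xor_of(2, 24) ? 3 :   /* n < 26 */
           4;
}

/* Byte count if the thresholds are powers of two, as presumably intended. */
int bytes_intended(unsigned long n)
{
    return n < (1UL << 8)  ? 1 :
           n < (1UL << 16) ? 2 :
           n < (1UL << 24) ? 3 :
           4;
}
```

With `n = 100`, `bytes_as_written` returns 4 where `bytes_intended` returns 1.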
Comment 4 Richard Biener 2019-06-14 11:33:04 UTC
But there's nothing invalid about these constant expressions?  But yeah....
Comment 5 Richard Biener 2019-06-14 11:33:46 UTC
Maybe we should accept 2**32 as extension ;)
Comment 6 Jonathan Wakely 2019-06-14 11:36:10 UTC
There's nothing wrong about implicit fallthrough, misleading indentation, ambiguous else, or missing parentheses in nested logic expressions either. But people get it wrong all the time.

I can't see a good reason to write 2^16 when you mean 18, or 10^9 when you mean 3, so it's probably a bug. And there's an easy workaround to avoid the warning: just write the exact constant as a literal, not an XOR expression.
Comment 7 Yann Droneaud 2019-06-14 11:40:08 UTC
The issue was noted on twitter by John Regehr, in https://twitter.com/johnregehr/status/1139295920997068800 and following messages.

The warning was suggested again by John Regehr in https://twitter.com/johnregehr/status/1139302389612077056
Comment 8 Jonathan Wakely 2019-06-14 11:48:53 UTC
The right heuristic for the warning isn't entirely obvious though.

I think it should only warn when both operands are integer literals. Should all kinds of integer literals be treated equally? Is 0x11 ^ 0b0011 wrong? Maybe not as obviously as 2^8 and 2^32.

Do these mistakes only happen for powers of 2 and powers of 10? Is it worth warning about 3^4?

After some more searches I'm not even sure 10^ is common enough to worry about. Maybe it should only warn for 2 ^ integer-literal.

Warning for X^X, where the same literal is given twice, probably makes sense; that would catch the 10^10 case in comment 1 (but not the -10^10 one).
Comment 9 Eric Gallager 2019-06-14 17:06:18 UTC
Confirmed. More discussion from that thread about possible heuristics for the warning: https://twitter.com/elwoz/status/1139522678396784642
* restricting it to just decimal literals probably makes sense, if someone is using the 0x or 0b prefix, they probably are in fact intending to do bit-twiddling with xor
* the "not from the expansion of <iso646.h>’s xor macro" criterion I can see possibly being a difficulty, due to how many other bugs there are about gcc's handling of macros from system headers...
Comment 10 Jonathan Wakely 2019-06-14 17:33:16 UTC
(In reply to Eric Gallager from comment #9)
> * the "not from the expansion of <iso646.h>’s xor macro" criterion I can see
> possibly being a difficulty, due to how many other bugs there are about
> gcc's handling of macros from system headers...

That's not relevant to C++ because xor is a keyword not a macro.
Comment 11 David Malcolm 2019-06-14 17:35:28 UTC
Warning for "2 ^ INT" seems reasonable, maybe just for that (I think I agree with comment #6).

Not sure what to call it: "-Wexclusive-or"???

I think we'd want to *not* warn if either of the operands are from a macro expansion.

I think both operands ought to be decimal integers to trigger the warning.

I like the wording from comment #2: "2 ^ 30 is 28, not 1073741824.", to make it clear what's going on (I hope).

Other idea: fix-it hints.

So maybe something like:

t.c:10:5: warning: '2^30' is 28; did you mean '1<<30' (1073741824) [-Wexclusive-or]

  log.Infof("Setting total memory to %.2f GB", float64(args.TotalMem)/(2^30))
                                                                       ~^~~
                                                                       1<<30

or somesuch.
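
The heuristic proposed in this comment can be sketched in plain C as follows. This is only an illustration of the rules listed above (left operand is decimal 2, both operands decimal literals, neither from a macro); `struct operand` and `xor_pow_diagnostic` are hypothetical and do not correspond to any GCC data structure or function.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical description of one operand of '^'. */
struct operand {
    long value;
    bool is_decimal_literal;  /* written as a decimal integer literal */
    bool from_macro;          /* came from a macro expansion */
};

/* Returns true and writes a message into buf when the heuristic fires. */
bool xor_pow_diagnostic(struct operand lhs, struct operand rhs,
                        char *buf, size_t len)
{
    if (lhs.from_macro || rhs.from_macro)
        return false;
    if (!lhs.is_decimal_literal || !rhs.is_decimal_literal)
        return false;
    if (lhs.value != 2)
        return false;
    if (rhs.value < 0 || rhs.value >= 63)  /* keep 1LL << rhs well-defined */
        return false;
    snprintf(buf, len, "'2^%ld' is %ld; did you mean '1<<%ld' (%lld)",
             rhs.value, lhs.value ^ rhs.value, rhs.value, 1LL << rhs.value);
    return true;
}
```

Called with the operands 2 and 30 (both decimal, neither from a macro), this produces the message text suggested above.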
Comment 12 Jonathan Wakely 2019-06-14 17:42:07 UTC
(In reply to David Malcolm from comment #11)
> Warning for "2 ^ INT" seems reasonable, maybe just for that (I think I agree
> with comment #6).
> 
> Not sure what to call it: "-Wexclusive-or"???

I suppose -Wxor is a bit cryptic-lookin'

What about -Wxor-used-as-pow ?

> I think we'd want to *not* warn if either of the operands are from a macro
> expansion.
> 
> I think both operands ought to be decimal integers to trigger the warning.

And not warn if the C++ 'xor' keyword is used, as nobody's going to think that "2 xor 8" means raising to the 8th power.

> I like the wording from comment #2: "2 ^ 30 is 28, not 1073741824.", to make
> it clear what's going on (I hope).

Yes, that's also how I phrased the various pull requests and bug reports I've submitted today.
Comment 13 Zack Weinberg 2019-06-15 02:30:17 UTC
Since examples of this error were observed with base 10, I think the warning should cover 10^i for decimal literal i, too.

Relatedly, “note: ^ performs exclusive or, not exponentiation” might be a nice addition to the existing error for ^ with a float for either operand.
Comment 14 Richard Biener 2019-06-17 08:57:11 UTC
(In reply to David Malcolm from comment #11)
> Warning for "2 ^ INT" seems reasonable, maybe just for that (I think I agree
> with comment #6).
> 
> Not sure what to call it: "-Wexclusive-or"???

-Wexp[onential] or -Wpow[er]?
Comment 15 Thorsten Glaser 2019-06-17 14:19:45 UTC
-Wexp sounds like experimental
Comment 16 Eric Gallager 2019-06-17 16:25:07 UTC
I think David's original suggestion of -Wexclusive-or is the best name so far.
Comment 17 Marc Glisse 2019-06-17 17:27:50 UTC
(In reply to Jonathan Wakely from comment #12)
> What about -Wxor-used-as-pow ?

-Wxor-power (or -Wpower-xor)?
Comment 18 Dávid Bolvanský 2019-06-17 17:51:56 UTC
-Wxor-as-pow ? :)
Comment 19 Dominik 'disconnect3d' Czarnota 2019-06-17 18:26:38 UTC
Also what if:

1. Someone does it through DEFINEs as in:
```
#define COMPUTING_BASE 2
#define BITS 32
// and later use
COMPUTING_BASE ^ (BITS-1)
```
I guess we will warn.

2. Someone does it through constexpr variables in C++, as in:
```
constexpr int COMPUTING_BASE = 2;
constexpr int BITS = 32;
// and later use
COMPUTING_BASE ^ (BITS-1)
```
This probably happens on a different level than the above, so we probably won't warn as it doesn't use integer literals?

3. Someone *really wants it*?
Maybe there should be a way to inform the compiler, e.g. via a comment to suppress the warning for a given line? For example:
```
printf("%d\n", 2^32 /* explicit-xor */);
```
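
Whether or not a warning fires for case 1, the computed value is just as surprising as with bare literals. A short C sketch using the macro names from the example above (the function names and the power-of-two intent are assumptions):

```c
#define COMPUTING_BASE 2
#define BITS 32

/* The expression from case 1: expands to 2 ^ (32 - 1), an XOR. */
int as_written(void) { return COMPUTING_BASE ^ (BITS - 1); }

/* Presumed intent: 2 to the power of (BITS - 1), written as a shift. */
unsigned long long intended(void) { return 1ULL << (BITS - 1); }
```

`as_written()` yields 29, while `intended()` yields 2147483648.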

Offtopic: if you care about adding more warnings in tragic situations, you might also want to look at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88000 ;)
Comment 20 Dávid Bolvanský 2019-08-16 21:06:20 UTC
Clang implemented [0] this diagnostic under -Wxor-used-as-pow.
From user perspective it would be reasonable if GCC follows this naming.

[0] https://reviews.llvm.org/D63423
Comment 21 Eric Gallager 2019-11-16 05:38:03 UTC
(In reply to Dávid Bolvanský from comment #20)
> Clang implemented [0] this diagnostic under -Wxor-used-as-pow.
> From user perspective it would be reasonable if GCC follows this naming.
> 
> [0] https://reviews.llvm.org/D63423

ok I retract my previous support for -Wexclusive-or and now back -Wxor-used-as-pow instead.
Comment 22 David Binderman 2020-08-29 13:12:07 UTC
clang only seems to warn for 2 ^ X and 10 ^ Y.

There seem to be about 30 cases of this problem across the Fedora Linux
distribution, so not the biggest problem in the world.
Comment 23 Eric Gallager 2021-11-29 00:23:28 UTC
putting -Wxor-used-as-pow in the title since that's what clang went with
Comment 24 David Malcolm 2022-08-10 16:36:58 UTC
I'm working on an implementation of this.
Comment 25 David Malcolm 2022-08-11 14:29:28 UTC
Created attachment 53435
v1 of a patch to implement -Wxor-used-as-pow

This patch implements the warning, but doesn't work well; as noted in the text it's implemented in the parser, when I think it might have to be implemented in the lexer.

Attaching it here for reference (and as a backup for my hard drive).
Comment 26 David Malcolm 2022-08-12 01:40:40 UTC
I implemented a better version of the patch; I've posted it for review here:
  https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599609.html
Comment 27 GCC Commits 2022-09-02 22:36:04 UTC
The master branch has been updated by David Malcolm <dmalcolm@gcc.gnu.org>:

https://gcc.gnu.org/g:bedfca647a9e9c1adadd8924f3ee0ab4189424e0

commit r13-2386-gbedfca647a9e9c1adadd8924f3ee0ab4189424e0
Author: David Malcolm <dmalcolm@redhat.com>
Date:   Fri Sep 2 18:29:33 2022 -0400

    c/c++: new warning: -Wxor-used-as-pow [PR90885]
    
    PR c/90885 notes various places in real-world code where people have
    written C/C++ code that uses ^ (exclusive or) where presumably they
    meant exponentiation.
    
    For example
      https://codesearch.isocpp.org/cgi-bin/cgi_ppsearch?q=2%5E32&search=Search
    currently finds 11 places using "2^32", and all of them appear to be
    places where the user means 2 to the power of 32, rather than 2
    exclusive-orred with 32 (which is 34).
    
    This patch adds a new -Wxor-used-as-pow warning to the C and C++
    frontends to complain about ^ when the left-hand side is the decimal
    constant 2 or the decimal constant 10.
    
    This is the same name as the corresponding clang warning:
      https://clang.llvm.org/docs/DiagnosticsReference.html#wxor-used-as-pow
    
    As per the clang warning, the warning suggests converting the left-hand
    side to a hexadecimal constant if you really mean xor, which suppresses
    the warning (though this patch implements a fix-it hint for that, whereas
    the clang implementation only has a fix-it hint for the initial
    suggestion of exponentiation).
    
    I initially tried implementing this without checking for decimals, but
    this version had lots of false positives.  Checking for decimals
    requires extending the lexer to capture whether or not a CPP_NUMBER
    token was decimal.  I added a new DECIMAL_INT flag to cpplib.h for this.
    Unfortunately, c_token and cp_tokens both have only an unsigned char for
    their flags (as captured by c_lex_with_flags), whereas this would add
    the 12th flag to cpp_tokens.  Of the first 8 flags, all but BOL are used
    in the C or C++ frontends, but BOL is not, so I moved that to a higher
    position, using its old value for the new DECIMAL_INT flag, so that it
    is representable within an unsigned char.
    
    Example output:
    
    demo.c:5:13: warning: result of '2^8' is 10; did you mean '1 << 8' (256)? [-Wxor-used-as-pow]
        5 | int t2_8 = 2^8;
          |             ^
          |            --
          |            1<<
    demo.c:5:12: note: you can silence this warning by using a hexadecimal constant (0x2 rather than 2)
        5 | int t2_8 = 2^8;
          |            ^
          |            0x2
    demo.c:21:15: warning: result of '10^6' is 12; did you mean '1e6'? [-Wxor-used-as-pow]
       21 | int t10_6 = 10^6;
          |               ^
          |             ---
          |             1e
    demo.c:21:13: note: you can silence this warning by using a hexadecimal constant (0xa rather than 10)
       21 | int t10_6 = 10^6;
          |             ^~
          |             0xa
    
    gcc/c-family/ChangeLog:
            PR c/90885
            * c-common.h (check_for_xor_used_as_pow): New decl.
            * c-lex.cc (c_lex_with_flags): Add DECIMAL_INT to flags as appropriate.
            * c-warn.cc (check_for_xor_used_as_pow): New.
            * c.opt (Wxor-used-as-pow): New.
    
    gcc/c/ChangeLog:
            PR c/90885
            * c-parser.cc (c_parser_string_literal): Clear ret.m_decimal.
            (c_parser_expr_no_commas): Likewise.
            (c_parser_conditional_expression): Likewise.
            (c_parser_binary_expression): Clear m_decimal when popping the
            stack.
            (c_parser_unary_expression): Clear ret.m_decimal.
            (c_parser_has_attribute_expression): Likewise for result.
            (c_parser_predefined_identifier): Likewise for expr.
            (c_parser_postfix_expression): Likewise for expr.
            Set expr.m_decimal when handling a CPP_NUMBER that was a decimal
            token.
            * c-tree.h (c_expr::m_decimal): New bitfield.
            * c-typeck.cc (parser_build_binary_op): Clear result.m_decimal.
            (parser_build_binary_op): Call check_for_xor_used_as_pow.
    
    gcc/cp/ChangeLog:
            PR c/90885
            * cp-tree.h (class cp_expr): Add bitfield m_decimal.  Clear it in
            existing ctors.  Add ctor that allows specifying its value.
            (cp_expr::decimal_p): New accessor.
            * parser.cc (cp_parser_expression_stack_entry::flags): New field.
            (cp_parser_primary_expression): Set m_decimal of cp_expr when
            handling numbers.
            (cp_parser_binary_expression): Extract flags from token when
            populating stack.  Call check_for_xor_used_as_pow.
    
    gcc/ChangeLog:
            PR c/90885
            * doc/invoke.texi (Warning Options): Add -Wxor-used-as-pow.
    
    gcc/testsuite/ChangeLog:
            PR c/90885
            * c-c++-common/Wxor-used-as-pow-1.c: New test.
            * c-c++-common/Wxor-used-as-pow-fixits.c: New test.
            * g++.dg/parse/expr3.C: Convert 2 to 0x2 to suppress
            -Wxor-used-as-pow.
            * g++.dg/warn/Wparentheses-10.C: Likewise.
            * g++.dg/warn/Wparentheses-18.C: Likewise.
            * g++.dg/warn/Wparentheses-19.C: Likewise.
            * g++.dg/warn/Wparentheses-9.C: Likewise.
            * g++.dg/warn/Wxor-used-as-pow-named-op.C: New test.
            * gcc.dg/Wparentheses-6.c: Convert 2 to 0x2 to suppress
            -Wxor-used-as-pow.
            * gcc.dg/Wparentheses-7.c: Likewise.
            * gcc.dg/precedence-1.c: Likewise.
    
    libcpp/ChangeLog:
            PR c/90885
            * include/cpplib.h (BOL): Move macro to 1 << 12 since it is
            not used by C/C++'s unsigned char token flags.
            (DECIMAL_INT): New, using 1 << 6, so that it is visible as
            part of C/C++'s 8 bits of token flags.
    
    Signed-off-by: David Malcolm <dmalcolm@redhat.com>
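
The point about the flag bits can be illustrated with a small C model. The two bit positions are taken from the libcpp ChangeLog entry above; everything else (the `DEMO_` names, the 16-bit source type, the helper functions, and an 8-bit `unsigned char`) is an assumption made for the sketch and is not the cpplib code.

```c
/* Bit positions as given in the ChangeLog; names are illustrative. */
#define DEMO_DECIMAL_INT (1 << 6)
#define DEMO_BOL         (1 << 12)

/* Models copying a wider flags word into an unsigned char field,
   as the C and C++ front-end tokens are described as doing. */
unsigned char narrow_flags(unsigned short flags)
{
    return (unsigned char)flags;
}

/* Nonzero if the given single-bit flag is still set after narrowing. */
int survives_narrowing(unsigned short flag)
{
    return (narrow_flags(flag) & flag) != 0;
}
```

A flag at bit 6 survives the narrowing, whereas one at bit 12 is dropped, which is why the new flag took over the low bit position.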
Comment 28 David Malcolm 2022-09-02 22:43:02 UTC
Implemented for GCC 13 by the above patch; marking as resolved.
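
For anyone updating code after this change, a brief C sketch of the rewrites suggested by the example output in the commit message: a hexadecimal left operand when XOR really is intended, and `1 << N` or `1eN` when a power was meant. The function names are illustrative.

```c
/* Explicit XOR: per the commit message, a hexadecimal left operand
   (0x2 rather than 2, 0xa rather than 10) suppresses the warning. */
int xor_two_eight(void) { return 0x2 ^ 8; }
int xor_ten_six(void)   { return 0xa ^ 6; }

/* Power of two, as in the fix-it hint for '2^8'. */
int two_to_the_eighth(void) { return 1 << 8; }

/* Power of ten, as in the fix-it hint for '10^6'; note 1e6 is a double. */
double ten_to_the_sixth(void) { return 1e6; }
```

The XOR forms keep their old values (10 and 12); the replacements give 256 and 1000000.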
Comment 29 Jonathan Wakely 2022-09-03 02:01:20 UTC
Excellent! Thanks, Dave
Comment 30 warp 2023-04-27 15:39:07 UTC
Note that a few of the examples in that first tweet are actually misleading. More particularly, the example in png.c. At first glance it looks like C and an example of this mistake, but if you look at the context it becomes clear that it actually isn't C at all (because that line appears in a context where it would be illegal code, namely, inside the initialization list of an array). It's actually BC code embedded in the C source code, for some reason. In BC 2^32 is legitimately 2 to the power of 32.

The other examples are probably legitimate, though.