90885 – GCC should warn about 2^16 and 2^32 and 2^64 [-Wxor-used-as-pow]

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	c
Version:	10.0

Importance:	P3 enhancement
Target Milestone:	13.0
Assignee:	David Malcolm

URL:
Keywords:	diagnostic, patch

Depends on:
Blocks:	new-warning, new_warning

Reported:	2019-06-14 08:06 UTC by Jonathan Wakely
Modified:	2023-04-27 15:39 UTC
CC List:	9 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2019-06-14 00:00:00

Attachments
v1 of a patch to implement -Wxor-used-as-pow (4.45 KB, patch) 2022-08-11 14:29 UTC, David Malcolm

Description Jonathan Wakely 2019-06-14 08:06:11 UTC

GCC doesn't warn for:

short maxshort = 2^16;
int maxint = 2^32;
long maxlong = 2^64;

Sadly people actually do this:
https://twitter.com/jfbastien/status/1139298419988549632
https://twitter.com/mikemx7f/status/1139335901790625793

We probably want it for both C and C++.

Comment 1 Jonathan Wakely 2019-06-14 08:18:48 UTC

And maybe also 10^X where X is a literal.
https://codesearch.isocpp.org/cgi-bin/cgi_ppsearch?q=10+%5E&search=Search

A sample:

        tp->tv_sec  = attributes[0] / 10^9;
        tp->tv_nsec = attributes[0] % 10^9;
    
     range->pmin[TREND_FCT] = -10^10;
     range->pmax[TREND_MEAN] = 10^10;


    #define IRQ_CHINT5_LEVEL        (12 ^ 7)
    #define IRQ_CHINT4_LEVEL        (11 ^ 7)
    #define IRQ_CHINT3_LEVEL        (10 ^ 7)
    #define IRQ_CHINT2_LEVEL        (9 ^ 7)
    #define IRQ_CHINT1_LEVEL        (8 ^ 7)

Comment 2 Jonathan Wakely 2019-06-14 09:41:06 UTC

https://github.com/google/gvisor/pull/375

Comment 3 Jonathan Wakely 2019-06-14 11:06:05 UTC

	read_bytes(&f, (char *) &(val),
		   ( (n < (2 ^ 8))  ? 1 :
		     ( (n < (2 ^ 16)) ? 2 :
		       ( (n < (2 ^ 24)) ? 3 :
			 4 ) ) ) );

Comment 4 Richard Biener 2019-06-14 11:33:04 UTC

But there's nothing invalid about these constant expressions?  But yeah....

Comment 5 Richard Biener 2019-06-14 11:33:46 UTC

Maybe we should accept 2**32 as extension ;)

Comment 6 Jonathan Wakely 2019-06-14 11:36:10 UTC

There's nothing wrong about implicit fallthrough, misleading indentation, ambiguous else, or missing parentheses in nested logic expressions either. But people get it wrong all the time.

I can't see a good reason to write 2^16 when you mean 18, or 10^9 when you mean 3, so it's probably a bug. And there's an easy workaround to avoid the warning: just write the exact constant as a literal, not an XOR expression.

Comment 7 Yann Droneaud 2019-06-14 11:40:08 UTC

The issue was noted on twitter by John Regehr, in https://twitter.com/johnregehr/status/1139295920997068800 and following messages.

The warning was suggested again by John Regehr in https://twitter.com/johnregehr/status/1139302389612077056

Comment 8 Jonathan Wakely 2019-06-14 11:48:53 UTC

The right heuristic for the warning isn't entirely obvious though.

I think it should only warn when both operands are integer literals. Should all kinds of integer literals be treated equally? Is 0x11 ^ 0b0011 wrong? Maybe not as obviously as 2^8 and 2^32.

Do these mistakes only happen for powers of 2 and powers of 10? Is it worth warning about 3^4?

After some more searches I'm not even sure 10^ is common enough to worry about. Maybe it should only warn for 2 ^ integer-literal.

Warning about X^X, where the same literal is given twice, probably makes sense; that would catch the 10^10 case in comment 1 (but not the -10^10 one).

Comment 9 Eric Gallager 2019-06-14 17:06:18 UTC

Confirmed. More discussion from that thread about possible heuristics for the warning: https://twitter.com/elwoz/status/1139522678396784642
* restricting it to just decimal literals probably makes sense; if someone is using the 0x or 0b prefix, they probably are in fact intending to do bit-twiddling with xor
* the "not from the expansion of <iso646.h>’s xor macro" criterion I can see possibly being a difficulty, due to how many other bugs there are about gcc's handling of macros from system headers...

Comment 10 Jonathan Wakely 2019-06-14 17:33:16 UTC

(In reply to Eric Gallager from comment #9)
> * the "not from the expansion of <iso646.h>’s xor macro" criterion I can see
> possibly being a difficulty, due to how many other bugs there are about
> gcc's handling of macros from system headers...

That's not relevant to C++ because xor is a keyword not a macro.

Comment 11 David Malcolm 2019-06-14 17:35:28 UTC

Warning for "2 ^ INT" seems reasonable, maybe just for that (I think I agree with comment #6).

Not sure what to call it: "-Wexclusive-or"???

I think we'd want to *not* warn if either of the operands are from a macro expansion.

I think both operands ought to be decimal integers to trigger the warning.

I like the wording from comment #2: "2 ^ 30 is 28, not 1073741824.", to make it clear what's going on (I hope).

Other idea: fix-it hints.

So maybe something like:

t.c:10:5: warning: '2^30' is 28; did you mean '1<<30' (1073741824) [-Wexclusive-or]

  log.Infof("Setting total memory to %.2f GB", float64(args.TotalMem)/(2^30))
                                                                       ~^~~
                                                                       1<<30

or somesuch.

Comment 12 Jonathan Wakely 2019-06-14 17:42:07 UTC

(In reply to David Malcolm from comment #11)
> Warning for "2 ^ INT" seems reasonable, maybe just for that (I think I agree
> with comment #6).
> 
> Not sure what to call it: "-Wexclusive-or"???

I suppose -Wxor is a bit cryptic-lookin'

What about -Wxor-used-as-pow ?

> I think we'd want to *not* warn if either of the operands are from a macro
> expansion.
> 
> I think both operands ought to be decimal integers to trigger the warning.

And not warn if the C++ 'xor' keyword is used, as nobody's going to think that "2 xor 8" means raising to the 8th power.

> I like the wording from comment #2: "2 ^ 30 is 28, not 1073741824.", to make
> it clear what's going on (I hope).

Yes, that's also how I phrased the various pull requests and bug reports I've submitted today.

Comment 13 Zack Weinberg 2019-06-15 02:30:17 UTC

Since examples of this error were observed with base 10, I think the warning should cover 10^i for decimal literal i, too.

Relatedly, “note: ^ performs exclusive or, not exponentiation” might be a nice addition to the existing error for ^ with a float for either operand.

Comment 14 Richard Biener 2019-06-17 08:57:11 UTC

(In reply to David Malcolm from comment #11)
> Warning for "2 ^ INT" seems reasonable, maybe just for that (I think I agree
> with comment #6).
> 
> Not sure what to call it: "-Wexclusive-or"???

-Wexp[onential] or -Wpow[er]?

Comment 15 Thorsten Glaser 2019-06-17 14:19:45 UTC

-Wexp sounds like experimental

Comment 16 Eric Gallager 2019-06-17 16:25:07 UTC

I think David's original suggestion of -Wexclusive-or is the best name so far.

Comment 17 Marc Glisse 2019-06-17 17:27:50 UTC

(In reply to Jonathan Wakely from comment #12)
> What about -Wxor-used-as-pow ?

-Wxor-power (or -Wpower-xor)?

Comment 18 Dávid Bolvanský 2019-06-17 17:51:56 UTC

-Wxor-as-pow ? :)

Comment 19 Dominik 'disconnect3d' Czarnota 2019-06-17 18:26:38 UTC

Also what if:

1. Someone does it through DEFINEs as in:
```
#define COMPUTING_BASE 2
#define BITS 32
// and later use
COMPUTING_BASE ^ (BITS-1)
```
I guess we will warn.

2. Someone does it through constexpr variables in C++, as in:
```
constexpr int COMPUTING_BASE = 2;
constexpr int BITS = 32;
// and later use
COMPUTING_BASE ^ (BITS-1)
```
This probably happens on a different level than the above, so we probably won't warn as it doesn't use integer literals?

3. Someone *really wants it*?
Maybe there should be a way to inform the compiler, e.g. via a comment to suppress the warning for a given line? For example:
```
printf("%d\n", 2^32 /* explicit-xor */);
```

Offtopic: if you care about adding more warnings in tragic situations, you might also want to look at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88000 ;)

Comment 20 Dávid Bolvanský 2019-08-16 21:06:20 UTC

Clang implemented [0] this diagnostic under -Wxor-used-as-pow.
From user perspective it would be reasonable if GCC follows this naming.

[0] https://reviews.llvm.org/D63423

Comment 21 Eric Gallager 2019-11-16 05:38:03 UTC

(In reply to Dávid Bolvanský from comment #20)
> Clang implemented [0] this diagnostic under -Wxor-used-as-pow.
> From user perspective it would be reasonable if GCC follows this naming.
> 
> [0] https://reviews.llvm.org/D63423

ok I retract my previous support for -Wexclusive-or and now back -Wxor-used-as-pow instead.

Comment 22 David Binderman 2020-08-29 13:12:07 UTC

clang only seems to warn for 2 ^ X and 10 ^ Y.

There seem to be about 30 cases of this problem across the Fedora Linux
distribution, so not the biggest problem in the world.

Comment 23 Eric Gallager 2021-11-29 00:23:28 UTC

putting -Wxor-used-as-pow in the title since that's what clang went with

Comment 24 David Malcolm 2022-08-10 16:36:58 UTC

I'm working on an implementation of this.

Comment 25 David Malcolm 2022-08-11 14:29:28 UTC

Created attachment 53435
v1 of a patch to implement -Wxor-used-as-pow

This patch implements the warning, but doesn't work well; as noted in the text it's implemented in the parser, when I think it might have to be implemented in the lexer.

Attaching it here for reference (and as a backup for my hard drive).

Comment 26 David Malcolm 2022-08-12 01:40:40 UTC

I implemented a better version of the patch; I've posted it for review here:
  https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599609.html

Comment 27 GCC Commits 2022-09-02 22:36:04 UTC

The master branch has been updated by David Malcolm <dmalcolm@gcc.gnu.org>:

https://gcc.gnu.org/g:bedfca647a9e9c1adadd8924f3ee0ab4189424e0

commit r13-2386-gbedfca647a9e9c1adadd8924f3ee0ab4189424e0
Author: David Malcolm <dmalcolm@redhat.com>
Date:   Fri Sep 2 18:29:33 2022 -0400

    c/c++: new warning: -Wxor-used-as-pow [PR90885]
    
    PR c/90885 notes various places in real-world code where people have
    written C/C++ code that uses ^ (exclusive or) where presumably they
    meant exponentiation.
    
    For example
      https://codesearch.isocpp.org/cgi-bin/cgi_ppsearch?q=2%5E32&search=Search
    currently finds 11 places using "2^32", and all of them appear to be
    places where the user means 2 to the power of 32, rather than 2
    exclusive-orred with 32 (which is 34).
    
    This patch adds a new -Wxor-used-as-pow warning to the C and C++
    frontends to complain about ^ when the left-hand side is the decimal
    constant 2 or the decimal constant 10.
    
    This is the same name as the corresponding clang warning:
      https://clang.llvm.org/docs/DiagnosticsReference.html#wxor-used-as-pow
    
    As per the clang warning, the warning suggests converting the left-hand
    side to a hexadecimal constant if you really mean xor, which suppresses
    the warning (though this patch implements a fix-it hint for that, whereas
    the clang implementation only has a fix-it hint for the initial
    suggestion of exponentiation).
    
    I initially tried implementing this without checking for decimals, but
    this version had lots of false positives.  Checking for decimals
    requires extending the lexer to capture whether or not a CPP_NUMBER
    token was decimal.  I added a new DECIMAL_INT flag to cpplib.h for this.
    Unfortunately, c_token and cp_token both have only an unsigned char for
    their flags (as captured by c_lex_with_flags), whereas this would add
    the 12th flag to cpp_tokens.  Of the first 8 flags, all but BOL are used
    in the C or C++ frontends, but BOL is not, so I moved that to a higher
    position, using its old value for the new DECIMAL_INT flag, so that it
    is representable within an unsigned char.
    
    Example output:
    
    demo.c:5:13: warning: result of '2^8' is 10; did you mean '1 << 8' (256)? [-Wxor-used-as-pow]
        5 | int t2_8 = 2^8;
          |             ^
          |            --
          |            1<<
    demo.c:5:12: note: you can silence this warning by using a hexadecimal constant (0x2 rather than 2)
        5 | int t2_8 = 2^8;
          |            ^
          |            0x2
    demo.c:21:15: warning: result of '10^6' is 12; did you mean '1e6'? [-Wxor-used-as-pow]
       21 | int t10_6 = 10^6;
          |               ^
          |             ---
          |             1e
    demo.c:21:13: note: you can silence this warning by using a hexadecimal constant (0xa rather than 10)
       21 | int t10_6 = 10^6;
          |             ^~
          |             0xa
    
    gcc/c-family/ChangeLog:
            PR c/90885
            * c-common.h (check_for_xor_used_as_pow): New decl.
            * c-lex.cc (c_lex_with_flags): Add DECIMAL_INT to flags as appropriate.
            * c-warn.cc (check_for_xor_used_as_pow): New.
            * c.opt (Wxor-used-as-pow): New.
    
    gcc/c/ChangeLog:
            PR c/90885
            * c-parser.cc (c_parser_string_literal): Clear ret.m_decimal.
            (c_parser_expr_no_commas): Likewise.
            (c_parser_conditional_expression): Likewise.
            (c_parser_binary_expression): Clear m_decimal when popping the
            stack.
            (c_parser_unary_expression): Clear ret.m_decimal.
            (c_parser_has_attribute_expression): Likewise for result.
            (c_parser_predefined_identifier): Likewise for expr.
            (c_parser_postfix_expression): Likewise for expr.
            Set expr.m_decimal when handling a CPP_NUMBER that was a decimal
            token.
            * c-tree.h (c_expr::m_decimal): New bitfield.
            * c-typeck.cc (parser_build_binary_op): Clear result.m_decimal.
            (parser_build_binary_op): Call check_for_xor_used_as_pow.
    
    gcc/cp/ChangeLog:
            PR c/90885
            * cp-tree.h (class cp_expr): Add bitfield m_decimal.  Clear it in
            existing ctors.  Add ctor that allows specifying its value.
            (cp_expr::decimal_p): New accessor.
            * parser.cc (cp_parser_expression_stack_entry::flags): New field.
            (cp_parser_primary_expression): Set m_decimal of cp_expr when
            handling numbers.
            (cp_parser_binary_expression): Extract flags from token when
            populating stack.  Call check_for_xor_used_as_pow.
    
    gcc/ChangeLog:
            PR c/90885
            * doc/invoke.texi (Warning Options): Add -Wxor-used-as-pow.
    
    gcc/testsuite/ChangeLog:
            PR c/90885
            * c-c++-common/Wxor-used-as-pow-1.c: New test.
            * c-c++-common/Wxor-used-as-pow-fixits.c: New test.
            * g++.dg/parse/expr3.C: Convert 2 to 0x2 to suppress
            -Wxor-used-as-pow.
            * g++.dg/warn/Wparentheses-10.C: Likewise.
            * g++.dg/warn/Wparentheses-18.C: Likewise.
            * g++.dg/warn/Wparentheses-19.C: Likewise.
            * g++.dg/warn/Wparentheses-9.C: Likewise.
            * g++.dg/warn/Wxor-used-as-pow-named-op.C: New test.
            * gcc.dg/Wparentheses-6.c: Convert 2 to 0x2 to suppress
            -Wxor-used-as-pow.
            * gcc.dg/Wparentheses-7.c: Likewise.
            * gcc.dg/precedence-1.c: Likewise.
    
    libcpp/ChangeLog:
            PR c/90885
            * include/cpplib.h (BOL): Move macro to 1 << 12 since it is
            not used by C/C++'s unsigned char token flags.
            (DECIMAL_INT): New, using 1 << 6, so that it is visible as
            part of C/C++'s 8 bits of token flags.
    
    Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Comment 28 David Malcolm 2022-09-02 22:43:02 UTC

Implemented for GCC 13 by the above patch; marking as resolved.

Comment 29 Jonathan Wakely 2022-09-03 02:01:20 UTC

Excellent! Thanks, Dave

Comment 30 warp 2023-04-27 15:39:07 UTC

Note that a few of the examples in that first tweet are actually misleading, in particular the one in png.c. At first glance it looks like C code making this mistake, but the context makes it clear that it isn't C at all: the line appears inside the initialization list of an array, where it would be invalid C. It's actually bc code embedded in the C source code, for some reason. In bc, 2^32 legitimately means 2 to the power of 32.

The other examples are probably legitimate, though.