Russell King 87067a935a ARM: Optimize multi-CPU tlb flushing a little more
The compiler does not conditionalize the assembly instructions for
the tlb operations, which leads to sub-optimal code being generated
when building a kernel for multiple CPUs.

We can tweak things fairly simply as the code fragment below shows:

    17f8:       e3120001        tst     r2, #1  ; 0x1
...
    1800:       0a000000        beq     1808 <handle_pte_fault+0x194>
    1804:       ee061f10        mcr     15, 0, r1, cr6, cr0, {0}
    1808:       e3120004        tst     r2, #4  ; 0x4
    180c:       0a000000        beq     1814 <handle_pte_fault+0x1a0>
    1810:       ee081f36        mcr     15, 0, r1, cr8, cr6, {1}
becomes:
    17f0:       e3120001        tst     r2, #1  ; 0x1
    17f4:       1e063f10        mcrne   15, 0, r3, cr6, cr0, {0}
    17f8:       e3120004        tst     r2, #4  ; 0x4
    17fc:       1e083f36        mcrne   15, 0, r3, cr8, cr6, {1}

Overall, for Realview with V6 and V7 CPUs configured:

   text    data     bss     dec     hex filename
4153998  207340 5371036 9732374  948116 ../build/realview/vmlinux.before
4153366  207332 5371036 9731734  947e96 ../build/realview/vmlinux.after
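
As a quick cross-check of the figures above, the saving can be recomputed from the text, data and bss columns. This is only an illustrative sketch: the struct and function names are hypothetical and not part of the kernel, and the numbers are copied from the two rows above.

```c
#include <assert.h>

/* One row of `size` output; values are copied from the table above. */
struct size_row {
	long text, data, bss, dec;
};

/* Hypothetical names, used only for this illustration. */
const struct size_row vmlinux_before = { 4153998, 207340, 5371036, 9732374 };
const struct size_row vmlinux_after  = { 4153366, 207332, 5371036, 9731734 };

/* The "dec" column is text + data + bss. */
long row_total(struct size_row r)
{
	return r.text + r.data + r.bss;
}

/* Bytes saved in the text section by the change. */
long text_saved(void)
{
	return vmlinux_before.text - vmlinux_after.text;
}

/* Bytes saved in the data section by the change. */
long data_saved(void)
{
	return vmlinux_before.data - vmlinux_after.data;
}

/* Bytes saved over the whole image. */
long total_saved(void)
{
	/* Sanity check: the recomputed totals match the quoted "dec" column. */
	assert(row_total(vmlinux_before) == vmlinux_before.dec);
	assert(row_total(vmlinux_after) == vmlinux_after.dec);
	return row_total(vmlinux_before) - row_total(vmlinux_after);
}
```

With these inputs, text shrinks by 632 bytes and data by 8 bytes, for 640 bytes overall; bss is unchanged.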

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24 09:38:52 +00:00